The Stack Structure

Posted on

What is the Stack data structure? Why and when to use it?

What is a stack data structure?

A data structure is a way to model code. An the functionality needed from this certain piece of code is to push and pop “things” from a stack. I need this stack to be locked down so that i can only grab the item at the top, and to add pushed items onto the bottom.

I am currently writing a web crawler in PHP and I have come up against this question. I needed to keep a “list” of urls to crawl. If you are interested the code can be found here.

To keep my code organised and modular I decided to create a “stack” class which is solely in charge of pushing and popping information from in a hierarchical fashion. When thinking of this stack of urls I though, what would i like this stacks functionality to be;

Specification

Well I want to push items to it,

  1. pop items off it
  2. check if its empty
  3. I may need to see which item is top of the list, or next to be popped

The End Result

  class Stack {

      protected $stack;
      protected $limit;

      protected function __construct($limit = 10) {
          // initialize the stack property
          $this->stack = array();
          // stack can only contain this many items
          $this->limit = $limit;
      }

      public function push($item) {
          // trap for stack overflow
          if (count($this->stack) < $this->limit) {
              // prepend item to the start of the array
              array_unshift($this->stack, $item);
          } else {
              return false;
          }
      }

      public function pop() {
          if ($this->isEmpty()) {
              // trap for stack underflow
  	      return false;
  	  } else {
              // pop item from the start of the array
              return array_shift($this->stack);
          }
      }

      public function top() {
          return current($this->stack);
      }

      public function isEmpty() {
          return empty($this->stack);
      }
  }

I then thought that I may possibly need two lists in my crawler, one for the urls to crawl (since this stack pops of the top url and never sees it again), and one for the completed urls. Also only one of these lists will need a function to search to see if that ur has been done, so to get around this I will use the Stack class as the base class and develop any list specific functionality onto of that. So i came to the UrlsToCrawlList class which uses the singleton pattern (see my post on the php singleton pattern) to make sure only one UrlsToCrawlList class is ever active at a time. The only parameter for this class is the limit of the size of the stack. This class has the has() method i mentioned which searches to see if we have the item.

class UrlsToCrawlList extends Stack {

    public static function Instance($limit)
    {
        static $inst = null;
        if ($inst === null) {
            $inst = new self($limit);
        }
        return $inst;
    }

    protected function __construct($limit)
    {
        parent::__construct($limit);
    }

    public function has($key)
    {
        return (array_search($key, $this->stack) === false);
    }
}
?>

So in action we have;

<?php
    $queue = UrlsToCrawlList::Instance(10);
    $queue->push('www.something.com');
    $queue->push('www.wfwefwefwe.com');
    var_dump($queue);
?>