Data structures Introduction

INTRODUCTION

Many different data structures are used in computer science to store and organize data in a specific way. Here is a list of some common data structures:

  1. Array: A collection of items stored at contiguous memory locations.

  2. Linked List: A linear data structure where each element is a separate object, connected using pointers.

  3. Stack: A last-in-first-out (LIFO) data structure where elements are added and removed from the top of the stack.

  4. Queue: A first-in-first-out (FIFO) data structure where elements are added to the back and removed from the front.

  5. Tree: A hierarchical data structure where each node has one or more child nodes.

  6. Graph: A non-linear data structure consisting of nodes and edges connecting them.

  7. Hash Table: A data structure that maps keys to values using a hash function.

  8. Heap: A tree-based data structure where the parent nodes are either greater than or less than their children (max heap or min heap).

  9. Set: A collection of elements with no specific order and no duplicate elements.

  10. Map: A data structure that maps keys to values, similar to a dictionary in Python.

Each of these data structures has its own characteristics and is used in different situations depending on the specific requirements of the problem at hand.

Array

An array is a collection of items stored at contiguous memory locations and accessed using an index. The items in an array are usually of the same data type and are stored in a linear fashion.

Here is an example of an array in Python:

Copy code# Declare an array of integers with 5 elements
arr = [1, 2, 3, 4, 5]

# Access an element of the array using its index
print(arr[2])  # Output: 3

# Modify an element of the array using its index
arr[4] = 10
print(arr)  # Output: [1, 2, 3, 4, 10]

# Find the length of the array
print(len(arr))  # Output: 5

Arrays are useful for storing and accessing large amounts of data efficiently, but they have a fixed size, so they are not suitable for situations where the size of the data is not known in advance or may change dynamically.

Linked List

A linked list is a linear data structure where each element is a separate object, connected using pointers. Each element, or node, in a linked list, consists of two fields: a data field to store the element itself, and a NEXT field that contains a reference to the next node in the list. The last node in the list has a reference to null, indicating the end of the list.

Here is an example of a linked list in Python:

Copy code# Define a Node class to represent a node in the linked list
class Node:
    def __init__(self, data=None):
        self.data = data
        self.next = None

# Define a LinkedList class to represent the linked list
class LinkedList:
    def __init__(self):
        self.head = None

    # Method to add a new node to the linked list
    def append(self, data):
        new_node = Node(data)
        if self.head is None:
            self.head = new_node
            return
        current_node = self.head
        while current_node.next is not None:
            current_node = current_node.next
        current_node.next = new_node

    # Method to print the elements of the linked list
    def print_list(self):
        current_node = self.head
        while current_node is not None:
            print(current_node.data, end=' ')
            current_node = current_node.next
        print()

# Create a new linked list and add some elements
ll = LinkedList()
ll.append(1)
ll.append(2)
ll.append(3)
ll.append(4)

# Print the elements of the linked list
ll.print_list()  # Output: 1 2 3 4

Linked lists are useful for situations where the size of the data is not known in advance or may change dynamically since they do not have a fixed size like arrays. However, they are not as efficient as arrays for accessing and manipulating elements at specific indices.

Stack

A stack is a last-in-first-out (LIFO) data structure where elements are added and removed from the top of the stack. It has two main operations: push, which adds an element to the top of the stack, and pop, which removes the element from the top of the stack. The element that is added last is the first one to be removed.

Here is an example of a stack in Python:

Copy code# Define a Stack class with the push and pop methods
class Stack:
    def __init__(self):
        self.items = []

    def push(self, item):
        self.items.append(item)

    def pop(self):
        return self.items.pop()

    def is_empty(self):
        return self.items == []

# Create a new stack and add some elements
stack = Stack()
stack.push(1)
stack.push(2)
stack.push(3)
stack.push(4)

# Remove and print the elements from the stack
while not stack.is_empty():
    print(stack.pop(), end=' ')  # Output: 4 3 2 1

Stacks are useful for situations where you need to maintain a history of events or perform undo/redo operations. They are also used in algorithms such as depth-first search and evaluating prefix and postfix expressions.

Queue

A queue is a first-in-first-out (FIFO) data structure where elements are added to the back and removed from the front. It has two main operations: enqueue, which adds an element to the back of the queue, and dequeue, which removes the element from the front of the queue. The element that is added first is the first one to be removed.

Here is an example of a queue in Python:

Copy code# Define a Queue class with the enqueue and dequeue methods
class Queue:
    def __init__(self):
        self.items = []

    def enqueue(self, item):
        self.items.append(item)

    def dequeue(self):
        return self.items.pop(0)

    def is_empty(self):
        return self.items == []

# Create a new queue and add some elements
queue = Queue()
queue.enqueue(1)
queue.enqueue(2)
queue.enqueue(3)
queue.enqueue(4)

# Remove and print the elements from the queue
while not queue.is_empty():
    print(queue.dequeue(), end=' ')  # Output: 1 2 3 4

Queues are useful for situations where you need to maintain a list of tasks or events and process them in the order they were received. They are also used in algorithms such as breadth-first search and in simulating real-world systems such as a printer queue or a message queue.

Tree

A tree is a hierarchical data structure where each node has one or more child nodes. The node at the top of the tree is called the root, and the nodes below the root are called child nodes. The nodes that have no children are called leaf nodes.

Here is an example of a tree in Python:

Copy code# Define a Node class to represent a node in the tree
class Node:
    def __init__(self, data=None):
        self.data = data
        self.children = []

    def add_child(self, node):
        self.children.append(node)

# Create the root node and add some child nodes
root = Node('A')
root.add_child(Node('B'))
root.add_child(Node('C'))
root.add_child(Node('D'))

# Add some grandchildren to the root's children
root.children[0].add_child(Node('E'))
root.children[0].add_child(Node('F'))
root.children[1].add_child(Node('G'))
root.children[1].add_child(Node('H'))
root.children[2].add_child(Node('I'))

# Print the tree using a depth-first traversal
def print_tree(node, level=0):
    print(' ' * 4 * level + '-' + node.data)
    for child in node.children:
        print_tree(child, level + 1)

print_tree(root)

The output of the above code would be:

-A
    -B
        -E
        -F
    -C
        -G
        -H
    -D
        -I

Trees are useful for storing hierarchical data and for searching and traversing data efficiently. Different types of trees have different properties and are used in different situations. Some common types of trees include binary trees, binary search trees, and red-black trees.

Graphs

A graph is a non-linear data structure consisting of nodes and edges connecting them. A node in a graph is also called a vertex, and an edge is a connection between two nodes.

A graph can be either directed or undirected. In a directed graph, the edges have a direction and are called arcs, and in an undirected graph, the edges have no direction and are called edges.

Here is an example of a graph in Python:

Copy code# Define a Graph class with the add_edge and print_graph methods
class Graph:
    def __init__(self, is_directed=False):
        self.graph = {}
        self.is_directed = is_directed

    def add_edge(self, u, v):
        if u in self.graph:
            self.graph[u].append(v)
        else:
            self.graph[u] = [v]
        if not self.is_directed:
            if v in self.graph:
                self.graph[v].append(u)
            else:
                self.graph[v] = [u]

    def print_graph(self):
        for u in self.graph:
            print(u, '->', ' -> '.join(self.graph[u]))

# Create a new undirected graph and add some edges
g = Graph()
g.add_edge('A', 'B')
g.add_edge('A', 'C')
g.add_edge('B', 'C')
g.add_edge('B', 'D')
g.add_edge('C', 'D')

# Print the graph
g.print_graph()

The output of the above code would be:

Copy codeA -> B -> C
B -> A -> C -> D
C -> A -> B -> D
D -> B -> C

Graphs are useful for representing relationships between data and for solving problems such as finding the shortest path between two nodes or determining whether a graph is connected. Different algorithms and data structures are used for different types of graph problems.

Hash maps

A hash table is a data structure that maps keys to values using a hash function. The hash function takes the key as input and returns an integer, which is used to index into an array of buckets or slots where the value is stored.

The hash function should be designed to distribute the keys evenly across the buckets, so that the average number of keys per bucket is small. This ensures that the hash table has good performance for inserting, deleting, and finding values by key.

Here is an example of a hash table in Python:

Copy codeclass HashTable:
    def __init__(self, size=10):
        self.size = size
        self.table = [[] for _ in range(self.size)]

    def hash_function(self, key):
        return hash(key) % self.size

    def insert(self, key, value):
        hash_key = self.hash_function(key)
        self.table[hash_key].append((key, value))

    def find(self, key):
        hash_key = self.hash_function(key)
        bucket = self.table[hash_key]
        for (k, v) in bucket:
            if k == key:
                return v
        return None

    def delete(self, key):
        hash_key = self.hash_function(key)
        bucket = self.table[hash_key]
        for i, (k, v) in enumerate(bucket):
            if k == key:
                del bucket[i]
                return v
        return None

# Create a new hash table and add some key-value pairs
ht = HashTable()
ht.insert('apple', 100)
ht.insert('banana', 200)
ht.insert('orange', 300)

# Find the value for a key
print(ht.find('apple'))  # Output: 100
print(ht.find('banana'))  # Output: 200
print(ht.find('orange'))  # Output: 300

# Delete a key-value pair
ht.delete('banana')
print(ht.find('banana'))  # Output: None

Hash tables are useful for storing and accessing data quickly, especially when the size of the data is large. They are used in various applications such as databases, caches, and hash-based data structures such as sets and maps. However, hash tables can have poor performance if the keys are not distributed evenly across the buckets or if the hash function is not well designed.

Heaps

A heap is a tree-based data structure where the parent nodes are either greater than or less than their children (max heap or min heap). In a max heap, the value of the parent node is always greater than or equal to the values of its children, and in a min heap, the value of the parent node is always less than or equal to the values of its children.

The root node of the heap is the node with the maximum (max heap) or minimum (min heap) value. Heaps are typically implemented as arrays, with the child nodes of the node at index i located at indices 2i+1 and 2i+2.

Here is an example of a max heap in Python:

Copy codeclass MaxHeap:
    def __init__(self):
        self.heap = []

    def push(self, item):
        self.heap.append(item)
        self.heapify_up()

    def pop(self):
        if len(self.heap) == 0:
            return None
        if len(self.heap) == 1:
            return self.heap.pop()
        root = self.heap[0]
        self.heap[0] = self.heap.pop()
        self.heapify_down()
        return root

    def heapify_up(self):
        idx = len(self.heap) - 1
        while idx > 0:
            parent_idx = (idx - 1) // 2
            if self.heap[idx] <= self.heap[parent_idx]:
                return
            self.heap[idx], self.heap[parent_idx] = self.heap[parent_idx], self.heap[idx]
            idx = parent_idx

    def heapify_down(self):
        idx = 0
        while True:
            left_child_idx = 2 * idx + 1
            right_child_idx = 2 * idx + 2
            largest_idx = idx
            if left_child_idx < len(self.heap) and self.heap[left_child_idx] > self.heap[largest_idx]:
                largest_idx = left_child_idx
            if right_child_idx < len(self.heap) and self.heap[right_child_idx] > self.heap[largest_idx]:
                largest_idx = right_child_idx
            if largest_idx == idx:
                return
            self.heap[idx], self.heap[largest_idx] = self.heap[largest_idx], self.heap[idx]
            idx = largest_idx

# Create a new max heap and add some elements
heap = MaxHeap()
heap.push(10)
heap.push(5)
heap.push(3)
heap.push(7)
heap.push(1)

# Remove and print the elements from the heap
while heap.heap:
    print(heap.pop(), end=' ')  # Output: 10 7 5 3 1

Map

A map, also known as a dictionary, is a data structure that stores a mapping of keys to values. Maps are often implemented using hash tables or search trees, but they can also be implemented using other data structures.

In a map, each key is unique and is used to retrieve the corresponding value. The values in a map can be of any data type.

Here is an example of a map in Python:

Copy code# Create a new map and add some key-value pairs
m = {}
m['apple'] = 100
m['banana'] = 200
m['orange'] = 300

# Find the value for a key
print(m['apple'])  # Output: 100

# Modify the value for a key
m['apple'] = 150
print(m['apple'])  # Output: 150

# Delete a key-value pair
del m['banana']
print(m)  # Output: {'apple': 150, 'orange': 300}

# Check if a key is in the map
print('apple' in m)  # Output: True
print('banana' in m)  # Output: False

Maps are useful for storing and accessing data efficiently using keys, and they are often used as an associative array