• What are vector databases?

    The rise of large language models has created demand for a different type of database. Instead of conventional data, we are now interested in storing and searching vectors.
  • Accessible Large Language Models (LLMs) using LangChain

    Large language models are getting more accessible every day with reduced resource requirements and easy-to-use wrapper libraries. LangChain is one of the leading libraries for getting started with language models in Python.
  • Can you stop large language models training on your data?

    Throwing huge amounts of data into a very large neural network and then trying to remove information is like finding a needle in a black hole. We need a simple and effective way to address training data collection.
  • GPT-4 as explained by GPT-4

    OpenAI has released its latest large language model, GPT-4. This is its explanation of what it is.
  • Legal issues for ChatGPT and Bing Chat round the corner

    ChatGPT and Bing Chat are becoming increasingly popular, but it will not be long before people start asking whether it is legal to scrape and commercialise their data.
  • Bing with ChatGPT - Is the future of internet search here?

    Microsoft has struck first in an ongoing AI arms race by integrating the latest generative language model ChatGPT into their search engine Bing. First impressions look very good.
  • Semafind - A modern knowledge management platform with semantic search

    Semafind aims to help teams and businesses store, share and search through their knowledge with the aid of natural language understanding.
  • How to use Docker containers in GitHub actions (CI)?

    Installing dependencies on the base images that GitHub Actions offers can be clumsy and time-consuming. Most build tools and environments come with official Docker images that we can directly utilise.
  • How to split a Python requirements file?

    Sometimes it is beneficial to have different requirements files in a project depending on the runtime environment. This post shows how to easily maintain multiple requirements.txt files.
  • How to setup and connect Firebase emulators?

    Firebase emulators allow you to develop locally and perform unit tests. It is very easy to run and connect to them.
  • How to setup a Raspberry Pi using SSH without a monitor?

    SSH is by default disabled on a new image of Raspberry Pi for security reasons. But using the Raspberry Pi Imager, you can easily configure the initial setup of the Raspberry Pi OS and enable SSH.
  • Typescript interfaces versus classes, which one to use?

    Both interfaces and classes in TypeScript provide a typed view of objects available in JavaScript. In some cases they look very similar to each other. So which one should you use?
  • 5 tips for training neural networks

    Across the many projects I've worked on, I've noticed some simple and effective steps that reduce frustration when training neural networks. I recommend following these tips and tricks when training deep neural network models.
  • Imperial College Machine Learning - Neural Networks

    There has been a lot of buzz surrounding *neural networks* in recent years with achievements made throughout various domains such as computer vision and natural language processing. But what on Earth are they? In these lecture notes and corresponding lectures we will explore, investigate and dissect some of the ideas behind neural networks.
  • Why do some UK laptops come with a US ANSI keyboard layout?

    There seems to be a trend amongst manufacturers to ship ANSI keyboards as standard UK keyboards on new laptops, which is very frustrating because what they are doing is incorrect.
  • Recipe for writing a PhD thesis

    Writing a large report or a thesis is a very daunting task. In this post, I talk about how to write a PhD thesis and what each chapter should include.
  • How to setup VS Code to compile LaTeX using Docker containers?

    Compiling LaTeX is very cumbersome and requires installing a lot of modules. While online apps have eased the stress of working with LaTeX, sometimes the best solution is to work locally. Using VS Code and its support for containers, I will look into how we can compile LaTeX projects using Docker containers.
  • Remote Working for Imperial Computing Students

    With remote working becoming more important, these short tutorials cover how Imperial College Department of Computing students can work remotely and efficiently. I provide some pointers to the main topics and encourage you to explore further.
  • How to generate plots using unit tests in Python?

    This elegant Python library called unitreport allows you to use matplotlib inside unit tests to generate self-contained HTML reports. It creates a robust, modular and self-contained approach for analysing datasets, models and more.
  • The Neuro-Symbolic Conundrum

    There is increasing interest in neuro-symbolic methods that combine recent advances in deep learning with symbolic methods of Artificial Intelligence. Yet, at the heart of their integration lies an intriguing puzzle.
  • How to handle logging in Python?

    Let your application tell you what is going on at critical steps with effective usage of logging. In this post, I talk over a simple setup for using the logging library in Python.
  • Where does Pylint look for configuration files?

    Pylint is a popular linting tool for the Python programming language, and often it needs to be configured. Pylint looks at several different locations to load its configuration.
  • The reality of accessible deep learning

    Thanks to recent libraries such as TensorFlow and PyTorch, deep learning has become so accessible that it can now be in anyone's toolbox. But at what cost do these libraries lower the barrier to entry for advanced methods?
  • Why I switched from Vim to Visual Studio Code?

    I have been using Vim for almost a decade now and this new year I decided to switch to Visual Studio Code. But why?
  • How to setup VS Code remote SSH with a jump host?

    The VS Code remote SSH extension is great for working on a remote machine, but by default it only allows for direct access to the remote host. In this short info post, I will look at how to set up the VS Code remote SSH extension to use a jump host.
  • C Recap for Pintos

    A short recap of the C programming language with a Pintos perspective as a memory refresher. We go over topics such as preprocessor directives, pointers and linked list structure in Pintos.
  • How to hash a dictionary in Python?

    You might notice that the built-in Python hash function does not work with dictionaries. That's for good reason because it can be inconsistent across platforms. In this post, I talk about a simple method using standard libraries to hash a Python dictionary in a more stable manner.
  • Argparse with multiple files to handle configuration in Python

    There are many methods for handling configuration files within a project and it can be difficult to find a solution that works well. In this post, I talk about a small library that allows argparse to work well across multiple files while still providing the expected argparse features such as help pages.
  • How to discover and run unit tests programmatically in Python?

    Sometimes you might want to run unit tests from another Python script and gather their results instead of running `python3 -m unittest` manually. Running test cases programmatically is quite easy and gives you a lot of control over what happens during or after running the test cases.
  • Building Spiking Neural Networks

    Spiking neural networks work by simulating the membrane potential of biological neurons. Unlike artificial neural networks (ANNs), spiking neural networks (SNNs) attempt to model the biological neurons that build up our brains. So while ANNs are inspired by biological brains, SNNs try to create them.
  • How does Google Pregel work?

    MapReduce requires the data chunks to be processed independently. This processing model is unsuitable for many graph models in which a calculation often requires knowledge about calculations done for other nodes. This issue is exactly what Pregel tackles with message passing.
  • How to handle configuration in Python?

    Configuration files can get messy when dealing with a large number of external parameters. Built-in solutions like `argparse` might not be scalable or clean enough to manage external parameters. In this post, I look at an alternative YAML-based solution that implicitly passes configuration to functions.
  • Keras Deep Learning 101

    A very quick introduction to Keras and common architecture types with short examples. The slides include code snippets and tips on training neural networks.
  • Project Workshops

    I cover useful information ranging from running batch jobs to an introduction to TensorFlow and Keras. These workshops are aimed at students who have machine-learning-based projects.
  • How to specify pip install location?

    Virtualenv provides a very good way to isolate packages. But sometimes it is better to install packages shared across projects in a custom location.
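
As a small illustration of the dictionary hashing entry in the list above, here is a minimal sketch that uses only the standard library. It assumes the dictionary is JSON-serialisable. The helper name `dict_hash` and the choice of SHA-256 are assumptions made for this example and are not necessarily what the linked post uses.

```python
import hashlib
import json


def dict_hash(dictionary):
    """Return a SHA-256 hex digest of a JSON-serialisable dictionary.

    Hypothetical helper for illustration only.
    """
    # sort_keys=True makes the serialised form independent of insertion order,
    # so two dictionaries with the same content produce the same digest.
    encoded = json.dumps(dictionary, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


if __name__ == "__main__":
    # Same content in a different key order gives the same digest.
    print(dict_hash({"a": 1, "b": 2}) == dict_hash({"b": 2, "a": 1}))
```

Serialising with sorted keys before hashing is one common way to get a digest that does not depend on key order or on the interpreter session. Values that `json` cannot serialise would need a custom encoder.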