Do they belong together?

cloud of points

Points are not only the most primitive datatype in computational geometry, but they are the most commonly used ones. If you need a quick overview of points, you can check this post. They are most commonly used in communicating the location of an event (the location of purchase, pickup point, a patient, etc…). But Whenever the count of these points became large enough, it would be hard to visualize them or make a pattern out of them. So sadly they may become useless. But the good news is that they can become more valuable than ever if you know what to do.

So if you cannot understand them, cluster them i.e. separate points into groups. But how? We will see later in this post. In many cases, clusters (groups) are easier to understand and reason about. For example, clustering students’ points into groups to assign a school bus for each group or performing suitability analysis to choose the best distribution for new hospitals given the residences of a city. Enough chitchat, let’s see.

Cluster Formation

First and foremost, we need to capture the points we are dealing with, If we are lucky (as I assume that we are) these points would fit in memory for processing. We also need to perform some operations on these newly acquired points like actually cluster them for instance, get the centroid of a specific cluster, and Compare a set of centroids against others. These operations can be abstracted in the header file below.

  • All code snippets of this blog can be found here
  • We are building on our point classes here
  • If you like this post, share, and give the repo a star
#ifndef POINT_POINTCLOUD_H
#define POINT_POINTCLOUD_H
#include "../points/point.h"
#include <iostream>
class pointCloud {
public:
    pointCloud();
    explicit pointCloud(vector<point *> &points);
    vector<vector<point *>> cluster(int clusters);
    double diameter();
    string asWKT();
    static point * getCentroid(const vector<point *> & pointCluster);
    static bool isSameCentroids(const vector<point *> & a,const vector<point *> & b);
    static void PrintClusters(const vector<vector<point *>> &clusters);
private:
    vector<point *> points;
    void assignPointsToCentroid(vector<point *> &centroids, vector<vector<point *>> &ps);
};
#endif //POINT_POINTCLOUD_H

So as we can see, the class constructor accepts a vector of points as an input to be used later. The Cluster method can be invoked to split the points into groups. The diameter method gets the diameter of the point cloud. asWKT returns a WKT representation of the point cloud. As well as a couple of static methods whose functions are quite clear.

So, without further ado, let’s dive into the constructor. As shown in the code snippet below, the constructor is quite simple. It simply stores the vector of points into the points property.

It is worth noting that this algorithm has a time complexity of O(t*k*n*d) where

  • t: number of iterations nade (depends on the initialization)
  • k: number of clusters required
  • n: number of points
  • d: dimensionality of the points

Which can be close to quadratic O(n2) in most cases. Although they are many more efficient implementations that go to linear complexity. check this one for example.

pointCloud::pointCloud(vector<point *> &points) {
    this->points = points;
}

The Clustering Logic

To actually cluster them we will use a simple yet intuitive algorithm called “k-means”. This algorithm takes an input value other than the points to be clustered. This input is the number of clusters to be generated denoted as n.

  1. The algorithm starts by selecting n points randomly (or preferably based on some heuristic) and assumes that these points are the perfect set of centroids.
  2. Then we assign each of the points in our point cloud to one of the cluster centroids based on euclidean distance.
  3. Then we recalculate the centroid for each cluster of points.
  4. If the old set of cluster centroid matches the newly acquired one, we know that this is the actual perfect set of clusters. So we return.
  5. If not we loop to step 2 until convergence.

This algorithm is guaranteed to be finite, although the full proof is out of this post’s scope. Listed below is a simple straightforward implementation of this algorithm.

vector<vector<point *>> pointCloud::cluster(int clusterCount) {
    //initially assume any points as centroids
    vector<point *> centroids = vector<point *>(points.begin(),points.begin()+clusterCount);
    vector<vector<point *>> clusters = vector<vector<point *>>(clusterCount);
    while (true){
        //assign points to centroids
        clusters = vector<vector<point *>>(clusterCount);
        assignPointsToCentroid(centroids,clusters);
        vector<point *> newCentroids ;
        //update centroids
        for (int i = 0; auto v : clusters) {
            newCentroids.push_back(getCentroid(v));
            i++;
        }
        //break when centroids match prev iteration
        if(isSameCentroids(newCentroids,centroids))break;
        else{
            centroids = newCentroids;
        }
    }
    return clusters;
}

void pointCloud::assignPointsToCentroid(vector<point *> &centroids, vector<vector<point *>>&ps) {
    for (int i = 0; i < points.size(); ++i) {
        float bestDistance = INTMAX_MAX;
        int bestCentroid = 0;
        //find the best cluster for it
        for (int j = 0;j<centroids.size();j++) {
            float distance = centroids[j]->distance(points[i]);
            if(bestDistance > distance){
                bestDistance = distance;
                bestCentroid = j;
            }
        }
        ps[bestCentroid].push_back(points[i]);
    }
}

bool pointCloud::isSameCentroids(const vector<point *> &a,const vector<point *> &b) {
    if(a.size()!=b.size())return false;
    for (int i = 0; i < a.size(); ++i) {
        if(a[i]->inProximity(0.000001,b[i]))continue;
        else return false;
    }
    return true;
}

To make this class testable, we need to provide some methods to print the input cloud of points as WKT (friendly format in most of the computational geometry platforms such as ARC desktop or QGIS) and it goes like this:

string pointCloud::asWKT() {
    string s = "GeometryCollection(";
        for(point * p : points)
            s+=p->asWKT()+",";
        s = s.substr(0,s.size()-1);
    s+=")";
    return s;
}

And we need also to print the out clusters, which is a trivial implementation:

void pointCloud::PrintClusters(const vector<vector<point *>> &clusters) {
    for (int i = 0; i < clusters.size(); ++i) {
        cout<< string(5,'-')<<"cluster \""<<i<<"\" points"<< string(5,'-')<<endl;
        for (point * p : clusters[i])
            cout<<p->asWKT()<<endl;
    }
}

See it working

To try out this simple algorithm we need to perform a set of steps:

  1. Create a list of random points
  2. Use this list to form a point cloud instance
  3. Print the representation of this point cloud
  4. Visualize it
  5. Cluster the point cloud
  6. Visualize the clusters

The code to perform this task goes al follows:

#include <iostream>
#include "points/point.h"
#include "pointCloud/pointCloud.h"
#include <unordered_set>
int main() {
    vector<point *> points;
    //make 10 2D points
    for (int i = 0; i < 10; ++i) {
        vector<float> coords;
        float randx = (float)(rand()%100)/100.0;
        float randy = (float)(rand()%100)/100.0;
        coords.push_back(randx);
        coords.push_back(randy);
        points.push_back(new point(coords));
    }
    //make point cloud
    pointCloud * pc = new pointCloud(points);
    cout<<pc->asWKT()<<endl;
    //cluster them into 2 clusters
    auto clusters = pc->cluster(3);
    pointCloud::PrintClusters(clusters);
    return 0;
}

The output of this program execution was as follows:

GeometryCollection(POINT(0.070000 0.490000 ),POINT(0.730000 0.580000 ),POINT(0.300000 0.720000 ),POINT(0.440000 0.780000 ),POINT(0.230000 0.090000 ),POINT(0.400000 0.650000 ),POINT(0.920000 0.420000 ),POINT(0.870000 0.030000 ),POINT(0.270000 0.290000 ),POINT(0.400000 0.120000 ))
-----cluster "0" points-----
POINT(0.070000 0.490000 )
POINT(0.230000 0.090000 )
POINT(0.270000 0.290000 )
POINT(0.400000 0.120000 )
-----cluster "1" points-----
POINT(0.730000 0.580000 )
POINT(0.920000 0.420000 )
POINT(0.870000 0.030000 )
-----cluster "2" points-----
POINT(0.300000 0.720000 )
POINT(0.440000 0.780000 )
POINT(0.400000 0.650000 )

The visualization of this point cloud was done using QGIS software and a quickWKT extension.

Randomly Generated points

The output of our clustering algorithm can be visualized as follows:

After Clustering

In the last figure, each color represents a cluster.

Points

Points are the simplest and the smallest geometrical entity that can be defined. So a point cannot be broken down into smaller parts and it will always occupy the space of zero. Points do not have any width, or length either. So what are points?

Points are the basic construction unit of any geometrical object, and they can be represented as a vector of coordinates. Each coordinate of this vector represents the distance between this point and the corresponding axis I.e. the point {1,3} distance from the x axis is 1 and 3 from the y axis.

any code snippets can be found here.

I”m going to make the coding part in c++ but you can follow up in any language, with that said, this is our header file:

#ifndef POINT_POINT_H
#define POINT_POINT_H
#include <vector>
#include <string>
using namespace std;
class point {
public:
    point(vector<float> &coords);
    //geojson or wkt
    point(string enc);
    float distance(point * p2);
    vector<float> coords();
    bool inProximity(float tolerance, point * p);
    string asGeojson();
    string asWKT();
private:
    vector<float>coordinates;
    void fromGeojson(string p);
    void fromWKT(string p);
};
#endif //POINT_POINT_H

Coordinates of a point

The order (size) of a vector representing a point must equal to the number of dimensions in the space in which we are considering this point i.e a point in a single dimension can only have one number p={x} as it’s coordinate, in 2d space that would be the familiar p={x,y}, in 3d that would be p={x,y,z} and so on on hyper dimensional space.

It is actually important to note that points cannot exists in zero or negative dimensions.

this would make up our constructor:

vector<float> point::coords() {
    return coordinates;
}

Operations on points

Points are points, they do not width, length, space, orientation so there is only very little we can do using points alone. So we will start by discussing distances between points in this post, and in the upcoming post we will discuss points cloud and point clusters. Simple as they are, points are essential to understand as every other geometrical object will depend on it, and have relations to points (along, inside, on parameter, outside, centroids, …)

points can be created of course, as we have discussed before, we need coordinates to create a point and that is it.

points have distances among them, and this is the only operation in this post.

distance between two points in 1D plain is simple just get the absolute differences between their coordinates for example the distance between p0={3}, p1={5} is simply 2 abs(3-5);

in 2D the formula becomes a little more complicated, it will involve the use Pythagoras theorem. In order to calculate the distance between two points, we will need to form a virtual right angle triangle between those points as shown blew:

this triangle will have a base of abs(p1[0] – p0[0]) and a hight of abs(p1[1] – p0[1]) thus naturally the distance between p0 and p1 is given by sqrt(width^2 – hight^2).

So we can deduce a more general formula for calculating the distance between two points as follows:

float point::distance(point *p2) {
if(coordinates.size()!=p2->coordinates.size())return -1;
double sum = 0;
for (int i = 0 ; i<coordinates.size();i++) {
sum += pow(abs(coordinates[i]-p2->coordinates[i]),2);
return sqrt(sum);
}
}

we can build up on this function another one that is fairly used too, this function checks if the two points are in close proximity to each other (we may require that the caller should specify the proximity tolerance), anyway the proximity function simply checks if the distance between two points are within a given tolerance as follows:

bool point::inProximity(float tolerance, point * p) {
    float dist= distance(p);
    if(dist<tolerance)return true;
    return false;
}

Next posts will cover points on spheres (like earth) and projections.

Representations of a point

perhaps the most famous representations of a point are the WKT and GeoJSON formats:

and they are quite simple, for example the GeoJSON format is as follows:

{
    "type": "Feature",
    "geometry": {
        "type": "Point",
        "coordinates": [
            x,
            y,
            z
        ]
    }
}

the generation of a geoJSON representation of a point, is quite straight forward:

string point::asGeojson() {
    string geojson =  "{\"type\":\"Feature\",\"geometry\":{\"type\":\"Point\",\"coordinates\":[";
    for(float c : coordinates){
        geojson+= to_string(c) + ",";
    }
    geojson.substr(0, geojson.size()-1);
    geojson +="]}}";
    return geojson;
}

while the wet representation is as follows:

POINT( x y z ...)

and the code generator function look like this:

string point::asWKT() {
    string wkt = "POINT(";
    for(float c : coordinates){
        wkt+= to_string(c) + " ";
    }
    wkt +=")";
    return wkt;
}

I’ll leave the point constructor that parses GeoJSON or WKT as an exercise for the user.

Introduction to computational geometry

Computational geometry has a wide range of applications and yet it is rarely discussed in tutorials. So I’m starting this series as an introduction to this interesting topic. I’m going to start from the very basics like point representation, standard formats, simple operations to complex topics like hyper plane routing in convex hulls. The rest of the post is organized as follows:

  • What is Computational geometry and it’s applications
  • Standard representations of computational geometry objects
  • Types of simple geometries
  • Simple operations on geometrical objects

What is Computational geometries

Computational geometry is a branch of computer science concerned with the representation, standardization, and making operations on geometrical objects like road networks, classification of point clouds or building simple models like human skeletons. Rarely any application that requires simulation of models such as building games, tracking, spacial data analysis or viewing customized maps does not depend heavily on computational geometry.

Standard Representations of Computational Geometries

There are several standards when it comes to computational geometries. Each of which is best fit a different situation. For example the representation used to serve a map can be quite different than the one used to communicate data throw an API and neither are suited for performing database operations.

For example most databases are familiar with both the WKT and WKB, While GEOJSON format is an API friendly. For map viewing there are several protocols such as WFS and MVT for serving vector data and WMS, WTMS and TMS are widely used for serving raster images. For serving large data files the commonly used formats are SHP, KML, etc….

Upcoming posts will explore each and everyone of these techniques as well as comparisons and implementations. keep tuned.

Types of Simple Geometries

Simple geometries start always with a point. A point can be identified by a vector which contain n elements, where n is the number of dimensions. For example, if we consider a 1D plain, then a point is simply a vector contains one number (coordinate) and in this special case only can a point represented as a scaler. If you were operating in a 2D space, a point will contain two elements p=[x, y], and in 3D it would be p=[x, y, z] and so on.

The second is lines, lines can be represented as two points, Lines extend beyond these points to span an infinite length while keeping its orientation. So in short a line is an ordered pair of points L=[p1,p2]. If the line has finite length, it can be represented as a set of points having the same orientation.

Lines that does not conform with the above discreption, such as roads are called line strings or poly lines, this kind of line is composed of a vector of lines where each line end connects to the next one beginning.

Triangles, squares, hexagons and polygon are all considered polygons. A polygon is represented as a set of lines, where each all the lines forms a closed shape.

Shapes where its inside is emptied or form more complex type of shape can be considered either multi-polygons or geometry collection, where geometry collection are used to define an object or more than one of the aforementioned types fused together.

Simple Operations on Geometries

Operations on geometries varies from calculating the distance of a line, or between two points. To smart grouping of points into clusters, investigating the center of a propagating signal, deciding whether a point, line, etc.. crosses or fits into another. To more complicated tasks such as routing, calculating shortest path, and more. If you are interested please like the post, and our page light syntax so you do not miss any go it

C# SELECT

It has been frustrating time working with several types of filters in dotnet framework specially while dealing with EFCore. So I made my decision to provide a library for generating those queries. Automating query generation and yet puts the developer in the front seat. Enjoy.

SELECT

Introducing our SELECT dotnet nuget package which is an IQueryable selection automation library for c#

LinQ query generator for c#

Applies to DB queries as well as dictionaries, lists, etc… (any class that implements IQueryable or IEnumerable)

Build Status

Features

  • Extremly easy to use
  • Light weight
  • Unify and simplfy Apis
  • Multi-platform
  • Works with relations, nested objectes, and Enums
  • Geometry Support
  • Pure c#
  • Opensource
  • MIT License

Markdown is an opensource free platform for automating search APIs while keeping the developer in control. It is built using the powerful features of Expressions in c#. Please feel free to contriute.

Installation

Using package manager

Install-Package SELECT -Version 1.0.0

Using DOTNET CLI

dotnet add package SELECT --version 1.0.0

Using PackageReference


Usage

  1. Create new Console Project in c# with name "test"
  2. Install SELECT
    dotnet add package SELECT --version 1.0.0
    
  3. Add class "Grade"
    public class Grade
    {
        public string subject { get; set; }
        public int grade { get; set; }
    }
    
  4. Add class "branch"
    public class branch
    {
        public string name { get; set; }
        public Grade grade { get; set; }
    }
    
  5. Add class "user"
    public class User
    {
        public string name { get; set; }
        public int age { get; set; }
        public branch branch { get; set; }
    }
    
  6. In the Main function, lets add some users
            List users = new();
            for (int i = 0; i &lt; 10; i++)
            {
                User user = new();
                user.name = i.ToString();
                user.age = i;
                user.branch = new branch
                {
                    grade = new()
                    {
                        subject = user.name + i.ToString(),
                        grade = user.age
                    },
                    name = &quot;test&quot;
    
                };
                users.Add(user);
            }
    
  7. Finally, lets use the package Lets make a request that select users who are 5 years old or younger, get their age, thier grade, and thier branch name Then order the results in a descending manner using the user’s age.
            Request request = new()
            {
                Items = &quot;age,branch.grade,branch.name&quot;,
                Order = new()
                {
                    IsAsc = false,
                    Name = &quot;age&quot;
                },
                Filters = new Filter[]
                {
                    new Filter(Operator.LtE,&quot;age&quot;,5)
                }
            };
    
  8. Retrive and print the results
          IQueryable result = users.AsQueryable().Construct(request);
          Console.WriteLine(JsonConvert.SerializeObject(result));
    

The complete example code

using System;
using System.Collections.Generic;
using System.Linq;
using Newtonsoft.Json;
using SELECT;
using SELECT.Entities;

namespace test
{
    class Program
    {
        static void Main(string[] args)
        {
            List users = new();
            for (int i = 0; i &lt; 10; i++)
            {
                User user = new();
                user.name = i.ToString();
                user.age = i;
                user.branch = new branch
                {
                    grade = new()
                    {
                        subject = user.name + i.ToString(),
                        grade = user.age
                    },
                    name = &quot;test&quot;

                };
                users.Add(user);
            }
            Request request = new()
            {
                Items = &quot;age,branch.grade,branch.name&quot;,
                order = new()
                {
                    IsAsc = false,
                    name = &quot;age&quot;
                },
                filters = new Filter[]
                {
                    new Filter(Operator.LtE,&quot;age&quot;,5)
                }
            };
            IQueryable result = users.AsQueryable().Construct(request);
            Console.WriteLine(JsonConvert.SerializeObject(result));
        }
    }

    public class User
    {
        public string name { get; set; }
        public int age { get; set; }
        public branch branch { get; set; }
    }
    public class branch
    {
        public string name { get; set; }
        public Grade grade { get; set; }
    }
    public class Grade
    {
        public string subject { get; set; }
        public int grade { get; set; }
    }
}

The Construct Method

This method acts on IQueryable objects, and recives a Request Object the Reuest Object consists of 3 Properties:

  1. Items is a comma separated string, where each substing contains a field name Nested Objectes can be accessed using the ".". For example; if you need to select ID, Name and Car model, the items string should be

     Items = "Id,Name,Car.Model"
    
  2. The order property is an object of type Order. This Object is resposable for ordering the result. This Object Contains 2 Properties

    Property Type Description
    name string The name of the property that is selected to order the reslts by
    IsAsc bool If true The order is ascending, else the order is descending
  3. The Filters Property is an array of the filter object. This specifies what result to reurn. Ie Applies its filters to the IQueryable object that acts on it. The filter Object is composed of 3 Components:

    Property Type Description
    fieldName string The field name we are setting a filter condition to
    value object The value we filtering againest
    op Operator The Operator used for filtring

    The operator object is an Enum Contains these operators

    Operator Description
    Eq Equal
    Lt Less Than
    Gt Greater Than
    In In
    Contains Contains
    GtE Greater Than or Equal
    LtE Less Than or Equal

    For Example If we need users Whos Id’s is less than 5, The Filter object will be:

    filters = new Filter[]
    {
        new Filter(Operator.Lt,"Id",5)
    }
    

    Or:

    filters = new Filter[]
    {
       new Filter(Operator.In,"Id",new long[]{1,2,3,4})
    }
    

    IF we need to get the Users Whos were last seen in a Egypt

            filters = new Filter[]
            {
                new Filter(Operator.In,
                "Location",
                "POLYGON((-335.1708984375 29.382175075145298,-334.9951171875 31.54108987958584,-325.7666015625 31.35363694150098,-325.01953125 29.535229562948473,-326.03027343749994 26.980828590472115,-323.0419921875 21.902277966668635,-334.9951171875 21.902277966668635,-335.1708984375 29.382175075145298))")
            }
    

Stack Using Golang Generics

With the recent introduction of generics in Golang, they have proved themselves not only elegant and simple, but also very efficient, check out the performance characteristics here. Now it is time to demonstrate their power in creating flexible data structures in this post. As promised in a later post, we are going to implement a stack in this post. The rest of this post is organized as follows:

  1. what is a stack
  2. stack methods
  3. implementation in Golang
  4. testing our stack
 

 

What is a Stack

Stacks are simple data structures that powers a great deal of modern computational systems like function calls in compilers themselves, LRU policy of memory block replacements, bracket matching problems and many many more. Stacks were first introduced in 1964 by Alan M. Turing and they were gaining popularity ever since.

 

 
 
 
A stack acts as a first in last out data structure most commonly known as LIFO. In layman terms this means it is a data store where you can insert an element at a time, but you can retrieve these elements back in reverse order. For example, if we insert these integers in order to a stack (Push them) 1,2,3,4,5 we can retrieve (Pull) them in reverse order i.e. 5,4,3,2,1. In other words, a stack act as a pile of books, you can always add books to the top of this book stack, but you can access the most recent that you but in.  If you need to find more, click here.
 
 

Stack Methods

Stacks usually poses four methods, these methods are:
  1. Init
  2. Pop
  3. Push
  4. Peak

The “Init” method usually used to instantiate the stack i.e allocate memory to be used in the stack. In strongly typed languages like Golang, it is required to define the type of elements that goes into the stack. In addition to the memory allocations, it is required for the stack to be thread safe, if it is going to run in a multi-threading environment. 

 
The “Pop” method simply removes the last element of the stack and return it to its caller. Thus achieving the purpose of LIFO.
 
The “Push” method simply appends an item to the top of the stack, to be retrieved later using the “Pop” method. 
 
The “Peak” method acts very similarly to the “Pop” method but without actually removing an item from the stack. 
 

Stack Implementation in Golang

Here goes the interesting stuff. First we create a struct that will represent our stack. This struct will contain a slice that will represent our data, a locking mechanism for safe threading. It’s required to keep these struct members private so no one miss with them from outside our package. These Items will be manipulated by our receiver functions only. So our struct will look something like this:

So in order to instantiate this struct and use it, we might provide a method to do that. This method will return a pointer to a stack so the caller can invoke the struct methods on them. The Other Three methods required by a stack to function probably are displayed below. Notice here that we do not allow simulations reads/writes to our data structure.

Testing our Stack

To test our stack, we first insert (Push) the items 0,1,2,3,4 and use the Pull method to retrieve them back. Here is the code snippet:

Happy Generic Stacks 🙂


 


 

Golang Generics Performance Evaluation and Implications

 

Generics is the latest upcoming feature in Golang. Introducing high level elegant syntax along with infinite possibilities. If you need an introduction kindly visit our post on Golang generics.

Unlike the previous post, This post is more concerned about evaluating the performance of Golang Generics and some future perspectives and implications. Any source code snippets used in this post can be found here. The rest of the post is organized as follows:

  1. Introduction to generics
  2. Benchmark experiments strategy
  3. Benchmark results
  4. Implications on Golang future

Introduction to generics

Generics has been around and used extensively (specially when writing libraries or frameworks) in many programming languages like c++, c#, java. However, generics are known to have some extra computational cost to explicitly typed code. With that in mind, it has been always a good idea to write in generics rather than explicit coding due to its huge advantages in readability, maintainability and code reuse.

The introduction of generics in Golang may raise many questions regarding its performance, backward compatibility, updating currently used standard libraries (which are build on interfaces and reflections) and whether it is worth it to use generics in Golang. To find the answers to this and more keep reading.

Designing the performance benchmark experiment 

 We will consider evaluating both time and memory costs in these two situations:

  1. light weight simple operations
  2. computationally demanding operations

And for each one of the previously aforementioned scenarios, we will compare the cost of using generics to the cost of using explicit typing and legacy interfaces.

Simple tasks

For a simple operation we might consider a task of adding two numbers. The next three code snippets will demonstrate the use of explicit type style, generic style and the legacy interface style respectively.

 

The Benchmarking code for these samples is demonstrated in the next three code snippets in the same order

The command to run those benchmarks is:

Computationally demanding tasks

For a computationally expensive task we have selected the famous Fibonacci” in its naïve implementation. As we did with the simple task evaluation, we are implementing three versions: an explicit style, a generic style and a legacy interface style. The code for these variations is listed below in a respective order along with their benchmark code.

Benchmark results

Testing environment :

We ran this test on a powerful Mac OS machine equipped with 24 core xeon processor along with 48GB Ram. Each test ran 100 times for accurate evaluation.

Simple task analysis:

As we may have noticed in the above figure, both the generic and explicit implementations had almost the same execution time and number of memory allocations which is rather impressive performance for a generic function compared to a an explicitly typed function. The legacy interface function came far behind them in terms of both memory and cpu time. The graph below compares the running time of these tasks in a bar plot graph.

 

Expensive task analysis:

The gap in the performance between the three techniques became more pronounced in a computationally extensive task as expected. The explicit code came a tad faster than the generic code, but not that far ahead, thus maintaining a relatively high performance and keeping simple elegant and readable code. The legacy implementation came worth as expected in both memory and time. The next bar plot graph illustrates this fact.
 
 

Which lead to the next section of this post, how will this affect the language, it’s usability and standard libraries?

Implications on Golang future

In this section we will discuss the impact of these changes and benchmarks on:
  1. currently written Golang standard libraries 
  2. currently running code in production servers
  3. code that will be written in the future

Currently written Golang standard libraries

Many of the current Golang standard libraries such as the famous “sort” were written using the legacy interface paradigm, furthermore they are used extensively in production code due to their famous reliability. So we can only wait to see if google will decide to update their libraries to take advantage of their new emerging generics technology. I’m guessing that the answer would be probably a big fat “FOR SURE” which leads us to the next bit.

Currently running code in production servers

If the previous assumption will evaluate to true, that means a lot of currently written packages that use Golang native libraries will break and will not be able to upgrade to the next version of go without considerable changes in their software artifacts. 

Code that will be written in the future

This is a happy time to consider starting using Golang with it’s new feature in your next project, taking advantage of all of its out of the box high performance tools and simple syntax. Happy coding.
 

Generics In Golang

Featured

Golang is an emerging technology that took the development world by storm. Introducing rapid software development that is consistent, strongly typed, insanely fast performance, and simple concurrency model; that is cheap, easy to learn and debug. 

As promised by google on twitter, finally generics are introduced in Golang.
 

This post will introduce the upcoming feature in Golang 1.18 called Generics. I’m going to assume basic familiarity with Golang. The rest of content is organized as follows:

  1. What are generics
  2. Why Golang?
  3. Golang before generics
  4. Generics in Golang
  5. Install Golang 1.18
  6. Simple stack using generics
  7. Implementation in Golang.

What are generics?

Generics are programming concepts that applies to functions, classes, and methods. That allows the usage and declaration of strongly typed software artifacts with complete independence on the type/s of their parameters. This can be achieved in a two step process.

First the function, method or class will provide a mechanism to accept the types as parameters.
Then the caller must provide those types when using the method. For example, if we need to provide a static function that receives two objects and returns the addition result.

Instead of writing a function for each type, we can make use of generics. As an illustration of how generics works in other languages, here are some examples in c++ and c#:

 

The above code snippets simply calls the add method with different data types and returns the addition result in the same data type as the arguments.

Why Golang?

If you haver ever used Golang (which I assume that you are) you may have came across some of its powerful syntax and programming paradigm. Golang natively included all the tools required to develop feature rich cross-platform applications. Including an impressive concurrency model, fault tolerance, amazingly realtime performance, unified database interface,…..

This completed with an outstanding community and google ownership leaves no stone unturned. Golang has unprecedented code reuse ability and package distribution. You can check the complete list of features here.

 

Golang before generics

Before the formal introduction to generics in Golang, creating a generic code was not impossible, but it involved the usage of type reflections, interfaces, and it was a bit frustrating to review and develop. This is what the  aforementioned add function would look like without the use of generics:

 

 
  • The idea here is use an empty interface to accept any kind of argument at line 9.
  • lines 10 throw 13, we check if both of the arguments are of the same type.
  • Lines 15 throw 31, checks the type of the arguments, casts the interface to that type and  perform the addition.
  • Lines 32 throw 34, returns error if the supplied type is not supported by the function.

Install Golang 1.18

In order to start using Golang generics, you must install Golang 1.18 or newer. Assuming that you have already an order version of Golang installed.
Just use this script.

 

If this works fine, you should see Golang 1.18 printed on the terminal window, If not please make a comment to get help.

 

Simple stack using generics

A very famous and useful application of generics is using stacks. If you don’t know about stacks, click here.

Generics can be a lot of help when it comes to the implementation of different data structures such as stacks, heaps, graphs, etc.. because it allows the development of a generic data structure rather than a typed locked ones.  We are going to implement the following methods:

  • pop: removes the first item of the stack
  • push: appends an item to the stack
  • peak: gets a peak at the next item in the stack

Implementation in Golang

The implementation of the stack itself would take place in an upcoming post. Here I would like to implement the add function (we have reviewed in c++ and c#) in Golang. And here it goes:

 
As we may have notice, there is some slight differences in Golang generics than other languages. For example: Golang used square brackets rather than the symbol “”.
Golang went further, it provided a mechanism (an optional mechanism i.e. not required) to restrict the type of data that goes into the function.
 
In the last code snippet the function “add” accepts two parameters a,b. The type of these parameters is denoted by the letter “T” where “T” can be any of these types. The function add also returns a variable of type “T”.
 
A detailed analysis and benchmarks of generics in Golang is to be expected in this blog. Stay tuned.
 

About Me

Ahmed Ezzat, MS,

Experienced Senior System Architect with a demonstrated history of working in the information technology and services industry. Skilled in Software Architecture, Programming , Requirements Analysis, Computer Science, and Databases. Strong engineering professional with a Master of Science – MS focused in Computer Engineering from Cairo University.

Published work:

1- GDD: Geometrical driven diagnosis based on biomedical data 

Egyptian Informatics Journal,
Volume 21, Issue 3,
2020,
Pages 183-190,
ISSN 1110-8665,
Abstract: Modern medical diagnosis heavily rely on bio-medical and clinical data. Machine learning algorithms have proven effectiveness in mining this data to provide an aid to the physicians in supporting their decisions. In response, machine learning based approaches were developed to address this problem. These approaches vary in terms of effectiveness and computational cost. Attention has been paid towards non-communicable diseases as they are very common and have life threatening risk factors. The diagnosis of diabetes or breast cancer can be considered a binary classification problem. This paper proposes a new machine learning based algorithm, Geometrical Driven Diagnosis (GDD), to diagnose diabetes and breast cancer with accuracy up to 99.96% and 95.8% respectively.
Keywords: GDD; Medical diagnosis; Classification; Machine learning; Diabetes; Big data; Bioinformatics

2- Search space pruning by evolving probabilistic strategies

M. B. Fayek and A. Ezzat, “Search space pruning by evolving probabilistic strategies,” 2017 12th International Conference on Computer Engineering and Systems (ICCES), 2017, pp. 487-492, doi: 10.1109/ICCES.2017.8275357.

Experience:

Computational geometry engineer @GulfComms

System architecture engineer @ProSyLab

Senior Industrial Embedded System Engineer @VDIS