Big data, machine learning, and scientific computing libraries for Go?



Hi all,

I have started using Go for some of my experimental work. Its performance looks promising, but a lot is still under development.

Has anyone used Go for big data or machine learning applications? If so, which libraries did you use, and are there any lessons or challenges you would like to share?




I have experimented with some data science libraries in Go. Here is an overview of the libraries currently available:

Machine learning related libraries:

bayesian - Naive Bayesian Classification for Golang.
CloudForest - Fast, flexible, multi-threaded ensembles of decision trees for machine learning in pure Go.
go-fann - Go bindings for the Fast Artificial Neural Networks (FANN) library.
go-galib - Genetic algorithms library written in Go.
go-pr - Pattern recognition package in Go.
gobrain - Neural networks written in Go.
godist - Various probability distributions, and associated methods.
GoLearn - General Machine Learning library for Go.
golinear - liblinear bindings for Go
goRecommend - Recommendation Algorithms library written in Go.
libsvm - Go version of libsvm, a derived work based on LIBSVM 3.14.
mlgo - This project aims to provide minimalistic machine learning algorithms in Go.
neural-go - A multilayer perceptron network implemented in Go, with training via backpropagation.
probab - Probability distribution functions. Bayesian inference. Written in pure Go.
regommend - Recommendation & collaborative filtering engine
shield - Bayesian text classifier with flexible tokenizers and storage backends for Go
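
To give a flavor of what the Bayesian text classifiers in this list do, here is a minimal sketch of a multinomial naive Bayes classifier with add-one smoothing. It uses only the standard library and is not the API of bayesian, shield, or any other package listed here; the Classifier type and its Train and Classify methods are names I made up for illustration.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// Classifier is a hypothetical multinomial naive Bayes text classifier.
// It is an illustration only, not the API of any library in the list.
type Classifier struct {
	wordCounts map[string]map[string]int // class -> word -> count
	totalWords map[string]int            // class -> total words seen
	docCounts  map[string]int            // class -> training documents
	vocab      map[string]struct{}       // every word seen in training
	docs       int                       // total training documents
}

func NewClassifier() *Classifier {
	return &Classifier{
		wordCounts: map[string]map[string]int{},
		totalWords: map[string]int{},
		docCounts:  map[string]int{},
		vocab:      map[string]struct{}{},
	}
}

// Train adds one labeled document, tokenized on whitespace.
func (c *Classifier) Train(class, text string) {
	if c.wordCounts[class] == nil {
		c.wordCounts[class] = map[string]int{}
	}
	c.docCounts[class]++
	c.docs++
	for _, w := range strings.Fields(strings.ToLower(text)) {
		c.wordCounts[class][w]++
		c.totalWords[class]++
		c.vocab[w] = struct{}{}
	}
}

// Classify returns the class with the highest log-probability,
// or "" if nothing has been trained yet.
func (c *Classifier) Classify(text string) string {
	best, bestScore := "", math.Inf(-1)
	words := strings.Fields(strings.ToLower(text))
	for class, n := range c.docCounts {
		// Log prior: share of training documents in this class.
		score := math.Log(float64(n) / float64(c.docs))
		// Add-one smoothing keeps unseen words from zeroing the score.
		denom := float64(c.totalWords[class] + len(c.vocab))
		for _, w := range words {
			score += math.Log(float64(c.wordCounts[class][w]+1) / denom)
		}
		if score > bestScore {
			best, bestScore = class, score
		}
	}
	return best
}

func main() {
	c := NewClassifier()
	c.Train("spam", "buy cheap pills now")
	c.Train("spam", "cheap offer buy now")
	c.Train("ham", "meeting agenda for monday")
	c.Train("ham", "lunch on monday")
	fmt.Println(c.Classify("cheap pills"))
	fmt.Println(c.Classify("monday meeting"))
}
```

Working in log space avoids underflow when documents get long. For real work, the listed libraries add better tokenizers and persistent storage.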

NLP (Natural Language Processing) libraries:

go-eco - Similarity, dissimilarity and distance matrices; diversity, equitability and inequality measures; species richness estimators; coenocline models.
go-nlp - Utilities for working with discrete probability distributions and other tools useful for doing NLP work.
go-stem - Implementation of the porter stemming algorithm.
golibstemmer - Go bindings for the snowball libstemmer library including porter 2
gounidecode - Unicode transliterator (also known as unidecode) for Go
icu - Cgo binding for icu4c C library detection and conversion functions. Guaranteed compatibility with version 50.1.
libtextcat - Cgo binding for libtextcat C library. Guaranteed compatibility with version 2.2.
MMSEGO - This is a Go implementation of MMSEG, which is a Chinese word splitting algorithm.
paicehusk - Golang implementation of the Paice/Husk Stemming Algorithm
porter - This is a fairly straightforward port of Martin Porter’s C implementation of the Porter stemming algorithm.
porter2 - Really fast Porter 2 stemmer.
segment - A Go library for performing Unicode Text Segmentation as described in Unicode Standard Annex #29
snowball - Snowball stemmer port (cgo wrapper) for Go. Provides word stem extraction using the native Snowball library.
stemmer - Stemmer packages for Go programming language. Includes English and German stemmers.
textcat - A Go package for n-gram based text categorization, with support for utf-8 and raw text

Since these libraries are under rapid development, I recommend checking each project's page for the latest status.

Scientific computing & data analysis:

blas - Implementation of BLAS (Basic Linear Algebra Subprograms)
ewma - Exponentially-weighted moving averages
geom - 2D geometry for golang
go-fn - Mathematical functions written in Go that are not covered by the math package
go-gt - Graph theory algorithms written in Go
go.matrix - Linear algebra for Go (development has stalled)
gocomplex - A complex number library for the Go programming language.
gofrac - A (goinstallable) fractions library for go with support for basic arithmetic.
gohistogram - Approximate histograms for data streams
gonum/mat64 - The general purpose package for matrix computation. Package mat64 provides basic linear algebra operations for float64 matrices.
gonum/plot - gonum/plot provides an API for building and drawing plots in Go.
goraph - A pure Go graph theory library (data structure, algorithm visualization)
gostat - A statistics library for the go language
mudlark-go - A collection of packages providing (hopefully) useful code for use in software using Google’s Go programming language.
streamtools - general purpose, graphical tool for dealing with streams of data.
vectormath - Vectormath for Go, an adaptation of the scalar C functions from Sony’s Vector Math library, as found in the Bullet-2.79 source code. (currently inactive)
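
As an example of the kind of streaming statistic these packages provide, here is a minimal sketch of an exponentially weighted moving average in plain Go. It is not the API of the ewma package listed above; the EWMA type, NewEWMA, Add, and Value are hypothetical names, and seeding the average with the first sample is my own assumption.

```go
package main

import "fmt"

// EWMA is a hypothetical exponentially weighted moving average.
// alpha is the smoothing factor in (0, 1]; larger values give
// more weight to recent samples.
type EWMA struct {
	alpha  float64
	value  float64
	seeded bool
}

func NewEWMA(alpha float64) *EWMA {
	return &EWMA{alpha: alpha}
}

// Add folds one sample into the average. The first sample seeds it
// (an assumption of this sketch; other seeding strategies exist).
func (e *EWMA) Add(x float64) {
	if !e.seeded {
		e.value = x
		e.seeded = true
		return
	}
	e.value = e.alpha*x + (1-e.alpha)*e.value
}

// Value returns the current average (0 before any sample is added).
func (e *EWMA) Value() float64 {
	return e.value
}

func main() {
	avg := NewEWMA(0.5)
	for _, x := range []float64{10, 20, 30} {
		avg.Add(x)
	}
	fmt.Println(avg.Value()) // prints 22.5
}
```

The update needs only the previous value, so memory use stays constant no matter how long the stream runs.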

Libraries for connecting to and operating on databases:

Relational Databases

firebirdsql - Firebird RDBMS SQL driver for Go
go-adodb - Microsoft ActiveX Data Objects (ADO) database driver for Go that uses database/sql.
go-bqstreamer - BigQuery fast and concurrent stream insert.
go-mssqldb - Microsoft MSSQL driver prototype in Go.
go-oci8 - Oracle driver for Go that uses database/sql.
go-sql-driver/mysql - MySQL driver for Go.
go-sqlite3 - SQLite3 driver for Go that uses database/sql.
gofreetds - Microsoft MSSQL driver. Go wrapper over FreeTDS.
pq - Pure Go Postgres driver for database/sql.

NoSQL Databases

aerospike-client-go - Aerospike client in Go language.
cayley - A graph database with support for multiple backends.
go-couchbase - Couchbase client in Go
gocb - Official Couchbase Go SDK
gocql - A Go language driver for Apache Cassandra.
gomemcache - memcache client library for the Go programming language.
gorethink - Go language driver for RethinkDB
mgo - MongoDB driver for the Go language that implements a rich and well tested selection of features under a very simple API following standard Go idioms.
neo4j - Neo4j Rest API Bindings for Golang
Neo4j-GO - Neo4j REST Client in golang.
neoism - Neo4j client for Golang
redigo - Redigo is a Go client for the Redis database.
redis - A simple, powerful Redis client for Go.

Search and Analytic Databases

bleve - A modern text indexing library for go.
elastic - Elasticsearch client for Google Go.
elastigo - An Elasticsearch client library.
goes - A library to interact with Elasticsearch.

You can look here for a more comprehensive list: