Our big data services

We have experience building big data analytics and web mining solutions using a variety of technologies.

We often use our own ADA framework when building analytics solutions because it gets us from raw data to visual analytics faster than anything else we know.

Below are some of the areas where we can help.

Solution architecture and design

The big data technology space is very active: new technologies and infrastructures enter it regularly, each making its own claims to technical excellence and fitness for purpose.

Each big data technology has its own benefits and drawbacks with regard to consistency, data availability, scalability, integration with other tools, flexibility and ease of operation, along with a range of other considerations that should be kept in mind during design.

We can help design your overall big data analytics architecture based on best practices and proven technologies that meet your needs today while anticipating those of tomorrow. This architecture can be on-premises or in the cloud, depending on what fits your needs best.

Large-scale data collection and management

An important aspect of any big data solution is collecting and managing large amounts of source data to be analyzed. These data are sometimes diverse in terms of both formats and systems and should be collected in a reliable and non-intrusive manner.

Storing data long term also introduces challenges, as formats inevitably change over time. Some log data is also sensitive in nature and should be managed in a secure way that does not reveal a person's identity, even to developers working on your analytics solution.

We advise on how such data should be collected and stored in your big data solution, including best practices for dealing with security and anonymity.
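
As one illustration of keeping identities hidden from developers, the sketch below replaces sensitive fields with keyed hashes (HMAC-SHA256) before records reach the analytics store. It is a minimal example under stated assumptions, not a description of how ADA does it: the field names, the `SECRET_KEY` value and the helper names `pseudonymize` and `scrub_record` are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret. In practice it would be stored where the
# analytics developers cannot read it.
SECRET_KEY = b"replace-with-a-secret-kept-outside-the-analytics-team"


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Return a stable keyed hash of value (HMAC-SHA256, hex encoded)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def scrub_record(record: dict, sensitive_fields=("user_id", "email")) -> dict:
    """Return a copy of record with the sensitive fields pseudonymized.

    The field names are assumptions made for this example.
    """
    return {
        field: pseudonymize(str(value)) if field in sensitive_fields else value
        for field, value in record.items()
    }
```

Because the same input always maps to the same pseudonym, per-user counts and trends still work, while recovering the original identifier requires the secret key.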

Advanced analytics — beautiful visualisation

Grouping and counting data and visualizing them as trends is useful for many kinds of analytics. We can help you go beyond simple analytics when necessary and really understand what your data mean, using statistical modeling and machine learning techniques to gain insight into the data and their current and emerging trends.

We take pride in making our visualisations easy to understand while also being pretty, using the latest in open source web technologies.

Exporting analytics data to other formats importable by Excel or statistics packages is of course also possible.

Flexible cloud deployments

As long-term users of cloud services such as Amazon Web Services, we can help both design and deploy your cloud-based big data analytics infrastructure in a cost-effective and flexible manner that fits both your current and future needs.

ADA is our super agile framework for big data analytics.

We built ADA because most BI frameworks are heavy even though they promise to be light. ADA gets you from raw logs to visual analytics in just a few hours (or less).

ADA is built by engineers for engineers and uses the latest in big data and web technologies.

Major components

  1. Generic and extensible tool for data loading (ada-load)
  2. Generic and extensible tool for running analytics jobs (ada-summarize)
  3. Skeleton application for data visualisation
  4. JavaScript data visualisation library

Data conversion and jobs

Analytics with ADA is very simple. Even if our existing data adapters (CSV, TSV, Apache HTTP Server logs, etc.) do not fit your data, you can easily write your own and plug it into ada-load to load data. Your code only has to convert the data, nothing more.
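
To give a feel for what a conversion-only adapter might look like, here is a minimal sketch that turns Apache Common Log Format lines into rows. The interface is an assumption made for illustration: the names `convert_line` and `convert` and the dictionary row shape are hypothetical and are not the actual ada-load plug-in API.

```python
import re
from typing import Iterable, Iterator, Optional

# Apache Common Log Format: host ident user [time] "request" status bytes
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def convert_line(line: str) -> Optional[dict]:
    """Convert one log line into a row, or return None if it does not parse."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    row = match.groupdict()
    row["status"] = int(row["status"])
    # Apache writes "-" when no bytes were sent.
    row["size"] = 0 if row["size"] == "-" else int(row["size"])
    return row


def convert(lines: Iterable[str]) -> Iterator[dict]:
    """Yield a row for every parseable line and skip the rest."""
    for line in lines:
        row = convert_line(line)
        if row is not None:
            yield row
```

The adapter contains no loading logic at all: it reads lines and yields rows, which is the only responsibility the text above assigns to your code.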

After loading data – typically into Hive – you can start writing analytics jobs. Many users use Hive because of its similarity to SQL and overall ease of use, but ada-summarize supports MongoDB, PostgreSQL and other pluggable backends as well.

Writing analytics jobs typically means writing queries in files that ada-summarize runs as analytics jobs. No programming other than writing your analytics query is necessary. ada-summarize runs jobs on the Hadoop cluster (typically) and stores summary data in MongoDB (typically).
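
The query-then-summarize flow can be sketched as follows. This is a stand-in under loud assumptions, not ada-summarize itself: Python's built-in sqlite3 module plays the role of Hive, a list of dictionaries plays the role of the summary documents stored in MongoDB, and the `requests` table, the query and the function names are hypothetical.

```python
import sqlite3

# The kind of query an analytics job file might contain (hypothetical).
QUERY = """
SELECT status, COUNT(*) AS hits
FROM requests
GROUP BY status
ORDER BY status
"""


def run_job(conn: sqlite3.Connection, query: str) -> list:
    """Run one query and return the result as document-like dictionaries."""
    cursor = conn.execute(query)
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


def demo() -> list:
    """Load a few sample rows into an in-memory database and summarize them."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE requests (path TEXT, status INTEGER)")
    conn.executemany(
        "INSERT INTO requests VALUES (?, ?)",
        [("/", 200), ("/about", 200), ("/missing", 404)],
    )
    try:
        return run_job(conn, QUERY)
    finally:
        conn.close()
```

Calling `demo()` returns one summary document per status code, each holding the status and its hit count. The only job-specific part is the query text, which mirrors the point made above.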

Data visualization

We provide a skeleton web application with a tab/panel model that allows you to easily add your own data visualization panels. Adding a panel is done in two simple steps:

  • Step 1: Define overall page structure using HTML and make some DIVs to be filled with analytics.
  • Step 2: Use the ADA JavaScript library to populate the above DIVs with data shown as tables, charts, trends, maps, etc.

Steps 1 and 2 above are usually just 10 to 30 lines of code, and our framework takes care of loading, displaying and navigating data.

Source code license included

We include a full source code license to ADA in big data consulting engagements where ADA is used, which gives you the flexibility of using it as a starting point to build out your own analytics platform without any vendor lock-in.