Looking for the best Big Data approach for my project

I am new to Big Data and have to figure out the best technological approach and tools for the following project.

I have four types of entities: Objects, Frames, Tags and Recordings.

  • Recordings are video recordings that are the source of all the data described below. I have around 5000 hours of video in 100,000 recordings.
  • Frames are the video frames belonging to the above Recordings. At 10 fps I have around 200 million frames.
  • Objects are objects appearing in the frames. There are around 100 objects per frame, so around 20 billion Objects. Each object has a size (W, H, D), a Speed and a Direction.
  • Tags are values of an enumeration. There are 100 unique values, and each frame may have zero to 20 tags associated with it, so around 2 billion frame-tag associations.
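
As a sanity check on these figures, here is a rough back-of-the-envelope calculation in Python. The average of 10 tags per frame is my own assumption (the midpoint of the zero-to-20 range), and the variable names are only illustrative.

```python
# Rough volume estimate from the figures above.
HOURS_OF_VIDEO = 5000
RECORDINGS = 100_000
FPS = 10
OBJECTS_PER_FRAME = 100
AVG_TAGS_PER_FRAME = 10  # assumption: midpoint of the 0..20 range

frames = HOURS_OF_VIDEO * 3600 * FPS          # seconds of video times frames per second
objects = frames * OBJECTS_PER_FRAME          # one row per object per frame
tag_rows = frames * AVG_TAGS_PER_FRAME        # one row per (frame, tag) pair
frames_per_recording = frames // RECORDINGS   # average recording length in frames

print(f"frames:               {frames:,}")
print(f"objects:              {objects:,}")
print(f"frame-tag rows:       {tag_rows:,}")
print(f"frames per recording: {frames_per_recording:,}")
```

This gives about 180 million frames, 18 billion objects and 1.8 billion frame-tag rows, which matches the rounded numbers above. An average recording is about 1,800 frames, or roughly 3 minutes.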

A Longest Frame Range is defined as the longest range of consecutive frames from a single recording which satisfy certain constraints.

The hardest query I need to solve is this:

Find in all the Recordings, the Longest Frame Ranges 
	Which are longer than X Frames, 
	And contain Objects with specific size and speed, 
	Where all frames contain a specific list of Tags
	And the frames belong to the same Recording. 
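
To make the query concrete, here is a minimal single-machine sketch in Python of what I mean. It is not a proposed solution at scale, only a statement of the logic. It assumes that every frame in a range must carry all the required tags and contain at least one matching object, and that "longest" means a run that cannot be extended in either direction. The `Frame` layout, the `object_matches` predicate and the function name are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable, List, Optional, Tuple

# Hypothetical object layout: (width, height, depth, speed).
Obj = Tuple[float, float, float, float]


@dataclass(frozen=True)
class Frame:
    recording_id: int
    index: int                 # frame number within its recording
    tags: FrozenSet[str]
    objects: Tuple[Obj, ...]


def longest_frame_ranges(
    frames: Iterable[Frame],
    required_tags: FrozenSet[str],
    object_matches: Callable[[Obj], bool],
    min_length: int,
) -> List[Tuple[int, int, int]]:
    """Return (recording_id, first_index, last_index) for every maximal run
    of consecutive qualifying frames that is longer than min_length.

    `frames` must be sorted by (recording_id, index).
    """
    ranges: List[Tuple[int, int, int]] = []
    run_start: Optional[Frame] = None   # first frame of the current run
    prev: Optional[Frame] = None        # previously visited frame

    def close_run() -> None:
        # While a run is open, `prev` is always its last frame.
        if run_start is not None and prev is not None:
            length = prev.index - run_start.index + 1
            if length > min_length:
                ranges.append((prev.recording_id, run_start.index, prev.index))

    for frame in frames:
        qualifies = required_tags <= frame.tags and any(
            object_matches(o) for o in frame.objects
        )
        contiguous = (
            run_start is not None
            and prev is not None
            and frame.recording_id == prev.recording_id
            and frame.index == prev.index + 1
        )
        if not (qualifies and contiguous):
            # The current run (if any) ends here; maybe start a new one.
            close_run()
            run_start = frame if qualifies else None
        prev = frame

    close_run()  # flush the run that reaches the end of the input
    return ranges
```

A single ordered pass like this is linear in the number of frames, so whichever tool I pick would ideally let me keep frames partitioned by recording and sorted by frame number.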

This query needs to return a list of Frame Ranges in under 5 minutes, ideally faster. It will be run around 10 times per day.
The data is well structured by nature, and the amount of data is constantly growing. Currently I have only 5% of the data in a local MySQL database, but I will move it to AWS and use whatever tools are needed. My main concern is speed, but the approach should also be cost-effective in terms of storage and compute.

Of all the different approaches (relational, document, column, key-value and graph) and all the tools out there, which approaches and tools should I try first?

Hi @kshepitzki, you could consider Hevo Data (https://hevodata.com/) to transfer the data from MySQL to AWS. The data pipeline Hevo Data provides is very easy to set up, and the support team is very responsive.

© Copyright 2013-2022 Analytics Vidhya