Tuesday, February 11, 2014

My take on TDD and Big Data

Following a discussion with Gil Zilberfeld on TDD and Big Data during the recent APIL14 conference, Gil sent me this video:



Admittedly, I have about 10 years of experience with TDD and BDD but have never had the chance to try TDD with Big Data. It sounds challenging and interesting.

My own take on this is similar to what the video described - I believe in BDD (Behavior Driven Development) and that seems to fit nicely in the Big Data world.
1. Define your expectations using a scenario (with Given-When-Then language or otherwise define the expected behavior of the system or a subset of it)
2. See the test fail
3. Build the system (data pipelines, filters, etc.) so that it makes the scenario pass
4. Refactor, rinse and repeat...
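
To make that loop a bit more concrete, here is a minimal sketch in Python of one trip around it for a single pipeline stage. The stage name `keep_pages_with_title` and the sample pages are made up for illustration; a real big-data pipeline would run on a proper framework, and this only shows the shape of the scenario. Step 2 (see it fail) is what you get if you run the scenario before the filter exists, or while it still returns every page.

```python
# Hypothetical pipeline stage, named only for illustration.
def keep_pages_with_title(pages):
    """Filter stage: keep crawled pages that carry a non-empty title."""
    return [page for page in pages if page.get("title", "").strip()]


def scenario_pages_without_title_are_dropped():
    # Given a small, hand-made sample of crawled pages (assumed data)
    pages = [
        {"url": "http://a.example", "title": "Agile Shmagile"},
        {"url": "http://b.example", "title": ""},
        {"url": "http://c.example"},
    ]

    # When the filter stage runs
    kept = keep_pages_with_title(pages)

    # Then only the page with a title survives
    assert [page["url"] for page in kept] == ["http://a.example"]


scenario_pages_without_title_are_dropped()
```

The point of keeping the sample tiny is that the expected behavior is obvious by eye; the same scenario can later be pointed at a bigger data set once the small one passes.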

Here is a contrived bit of pseudo-code showing what I have in mind:
Suppose I am Google (hey, why not?) and I want to build a search engine (how come they never thought of it?)
The search engine would have a crawler that crawls websites and streams data into a distributed big-data index.
People at home would enter a search query and get results from the index. Obviously the results might vary depending on how close the person is to a data node, and on whether the data has synced between nodes yet.

So... ready? (I know I'm not but what the heck, it's only the whole internet reading this)

Scenario 1: Indexing a site into one node, and searching from another node before data has synced

Given two nodes
and given a website called "Agile Shmagile"
and given that the site is only indexed in node1
and given a user that is close to node2

When the user searches "Agile Shmagile"
Then she should not get the website in the search results

Scenario 2: The site gets synced into node2, and now the user should get the website in the search results

Given all of the above from Scenario 1
and given that the website has synced into node2

When the user searches "Agile Shmagile"
Then she should get the website in first place on Google
and "Agile Shmagile" authors should get a trillion bucks
and they should retire to the Bahamas....
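
For the curious, the two scenarios above could be turned into something runnable along these lines. This is only a toy sketch in Python: `Node`, `User`, `index_site` and `sync_from` are names I made up, the "index" is an in-memory dictionary rather than anything distributed, and the trillion bucks and the Bahamas are left out because I have no idea how to assert on them.

```python
class Node:
    """A toy index node: maps a lowercased search term to site names."""

    def __init__(self, name):
        self.name = name
        self.index = {}

    def index_site(self, site):
        self.index.setdefault(site.lower(), []).append(site)

    def search(self, query):
        return list(self.index.get(query.lower(), []))

    def sync_from(self, other):
        """Copy over any entries the other node has and this one lacks."""
        for term, sites in other.index.items():
            mine = self.index.setdefault(term, [])
            for site in sites:
                if site not in mine:
                    mine.append(site)


class User:
    """A user whose searches are served by the closest node."""

    def __init__(self, nearest_node):
        self.nearest_node = nearest_node

    def search(self, query):
        return self.nearest_node.search(query)


def given_site_indexed_only_in_node1():
    # Given two nodes, a site indexed only in node1, and a user close to node2
    node1, node2 = Node("node1"), Node("node2")
    node1.index_site("Agile Shmagile")
    user = User(nearest_node=node2)
    return node1, node2, user


def scenario_1_search_before_sync():
    node1, node2, user = given_site_indexed_only_in_node1()
    # When the user searches
    results = user.search("Agile Shmagile")
    # Then she should not get the website in the search results
    assert "Agile Shmagile" not in results


def scenario_2_search_after_sync():
    node1, node2, user = given_site_indexed_only_in_node1()
    # and given that the website has synced into node2
    node2.sync_from(node1)
    # When the user searches
    results = user.search("Agile Shmagile")
    # Then she should get the website in first place
    assert results[0] == "Agile Shmagile"


scenario_1_search_before_sync()
scenario_2_search_after_sync()
```

Sharing the Given steps through one helper mirrors the "Given all of the above" line in Scenario 2, and each scenario still builds its own fresh nodes so the two do not depend on the order they run in.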

Hmmm.... I think it's good to go live - I just need to make that last assertion pass...
Bahhh - should be easy.

:)

