
Yahoo! S4 - Analysis of Distributed System

Analysis of Distributed System - Yahoo! S4


S4 (Simple Scalable Streaming System) was designed at Yahoo! in the context of its search engine, to support data mining and machine learning algorithms, and it takes its inspiration from the MapReduce model. MapReduce makes it possible to parallelize and distribute batch processing tasks across immense clusters with little or no human intervention for issues such as failover management. S4 applies the same idea to streams: it is a low-latency, scalable stream processing engine that automatically keeps up with the event flow at a given data rate.

Unlike Hadoop, the popular batch processing system built on MapReduce, which typically operates on static data by scheduling batch jobs, S4 is a stream processing system that handles events as they arrive.

Using a MapReduce platform for streams, by contrast, requires partitioning the input data into fixed-size segments that the platform then processes. Latency is proportional to the length of a segment plus the overhead of segmenting the data and initiating the processing jobs, so there is a tradeoff: small segments reduce latency but add overhead, while large segments do the opposite.
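The tradeoff above can be made concrete with a small calculation. This is only a sketch under an assumed model: latency is taken as segment length plus a fixed per-segment overhead, and the 2-second overhead is a hypothetical number chosen for illustration, not a measured value.

```python
def segment_latency(segment_seconds: float, overhead_seconds: float) -> float:
    """Approximate delay before results for a segment are available.

    Assumed model: the length of the segment plus a fixed per-segment overhead.
    """
    return segment_seconds + overhead_seconds


def overhead_share(segment_seconds: float, overhead_seconds: float) -> float:
    """Fraction of the total time spent on overhead rather than on data."""
    return overhead_seconds / (segment_seconds + overhead_seconds)


OVERHEAD_SECONDS = 2.0  # hypothetical fixed cost of segmenting and starting a job

for segment in (1.0, 10.0, 60.0):
    latency = segment_latency(segment, OVERHEAD_SECONDS)
    share = overhead_share(segment, OVERHEAD_SECONDS)
    print(f"segment={segment:>5.1f}s  latency={latency:>5.1f}s  overhead share={share:.0%}")
```

Shorter segments give lower latency, but a larger share of each run goes to overhead, which is the tension a batch-based approach to streams has to live with.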

S4 shares its purpose, big data processing and data mining, with IBM's Stream Processing Core (SPC) middleware, but the two differ mainly in their architectural design. While the SPC design is derived from a subscription model, the S4 design is a combination of the MapReduce model and the Actors model.

S4 has a simple and elegant cluster management system: all nodes are symmetric and there is no single centralized node in the cluster. This is accomplished by leveraging ZooKeeper, a coordination service that can be shared by many systems in a data center.

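An S4 stream is defined as a sequence of events of the form (K, A), where K holds tuple-valued keys and A holds tuple-valued attributes. The keys determine how events are routed, and the goal is to consume incoming events with minimal latency.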
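An event made of keys and attributes can be pictured as a small record. The sketch below is illustrative only: the `Event` class, its field names, and the `WordEvent` example are assumptions made for this post, not the actual S4 classes.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Tuple


@dataclass
class Event:
    """Illustrative event of the form (K, A); not the real S4 event class."""
    event_type: str
    key: Tuple[Tuple[str, Any], ...]                          # K: tuple-valued key
    attributes: Dict[str, Any] = field(default_factory=dict)  # A: attributes

    def key_value(self) -> Tuple[Any, ...]:
        """Values of the keyed attributes, used to decide where the event goes."""
        return tuple(value for _, value in self.key)


# A hypothetical word-count event keyed on the word itself.
word_event = Event("WordEvent", key=(("word", "stream"),), attributes={"count": 1})
print(word_event.key_value(), word_event.attributes)
```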

The S4 design (for the Yahoo! search engine) includes four main parts: Processing Elements (PEs), the basic computational units that consume events; Processing Nodes (PNs), the logical hosts of processing elements, which listen for events and pass them to the right PE; the Communication Layer, which maps logical nodes to physical nodes, automatically re-maps them on failure, and coordinates between nodes using ZooKeeper; and Configuration Management, through which human operators set up and tear down clusters for S4 tasks.
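The relationship between processing nodes and processing elements can be sketched in a few lines. This is a simplified illustration, not the S4 API: the class and method names are made up, and it assumes one PE instance per distinct key value (here, one per word), created the first time that key is seen.

```python
from typing import Any, Dict


class WordCountPE:
    """Hypothetical processing element: one instance per distinct word."""

    def __init__(self, word: str) -> None:
        self.word = word
        self.count = 0

    def process_event(self, attributes: Dict[str, Any]) -> None:
        # Add the count carried by the event (defaults to 1 if absent).
        self.count += attributes.get("count", 1)


class ProcessingNode:
    """Hypothetical logical host: receives events and hands them to PEs."""

    def __init__(self) -> None:
        self.pes: Dict[str, WordCountPE] = {}

    def on_event(self, word: str, attributes: Dict[str, Any]) -> WordCountPE:
        pe = self.pes.get(word)
        if pe is None:
            # First event for this key value: create a PE for it.
            pe = self.pes[word] = WordCountPE(word)
        pe.process_event(attributes)
        return pe


node = ProcessingNode()
for w in "to be or not to be".split():
    node.on_event(w, {"count": 1})

counts = {word: pe.count for word, pe in node.pes.items()}
print(counts)
```

In a real cluster there would be many such nodes, with the communication layer deciding which node receives each event.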

As for the programming model, the S4 Processing Element API is written in Java, while the communication layer API provides bindings in several programming languages (e.g., Java and C++).

Furthermore, S4 has been applied to online parameter optimization, which tunes the parameters of the search advertising system on live traffic so that it returns more favorable results.

Overall, the Yahoo! S4 architecture is simple and elegant for its search engine, and it has run successfully on real traffic slices of a search advertising system, where the slices are based on the user space and cover thousands of users per day. It still has some issues: the tradeoff between latency and segmentation needs to be improved, and the migration of processing elements is fragile and needs to become more robust and sustainable. Moreover, it uses static routing and lacks dynamic load balancing.
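To see why static routing without dynamic load balancing can hurt, here is a rough sketch. It assumes routing is a fixed hash of the event key modulo the number of nodes, which is one common way to do static routing and not necessarily what S4 does internally; the key names and the 90/10 traffic split are invented for the example.

```python
import zlib
from collections import Counter


def route(key: str, num_nodes: int) -> int:
    """Static routing sketch: a fixed hash of the key picks the node."""
    return zlib.crc32(key.encode("utf-8")) % num_nodes


NUM_NODES = 4  # hypothetical cluster size

# Hypothetical skewed traffic: one very popular key plus a few rare ones.
events = ["hot"] * 90 + [f"key{i}" for i in range(10)]

load = Counter(route(key, NUM_NODES) for key in events)
for node_id in range(NUM_NODES):
    print(f"node {node_id}: {load.get(node_id, 0)} events")
```

Every event for the popular key lands on the same node no matter how busy that node is, so one node carries at least 90% of the traffic while the others stay mostly idle.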
