- Job Number: 26671823
- Santa Clara Valley, California, United States
- Posted: Oct. 26, 2013
- Weekly Hours: 40.00
Job Summary
Seeking a high-energy, experienced Big Data Engineer with a proven record of developing advanced analytics on complex data from multiple sources on a Hadoop technology stack. The ideal candidate will have experience building reporting solutions using both NoSQL solutions and relational database/SQL solutions.
Key Qualifications
- 3 to 5 years of Core Java development within large distributed systems
- 2+ years of experience in Python programming
- 2+ years of experience as a competent developer of Hadoop MapReduce jobs for large-scale data processing (e.g., HBase, Pig)
- Strong development skills around Hadoop, Hive, Pig, HBase, Mahout/R, MapReduce, and Web Services
- Experience with big data solutions using major Hadoop distros and platforms like Cloudera, Hortonworks, MapR, or Amazon EMR is essential. Experience with NoSQL databases like HBase, Cassandra, and MongoDB is a plus.
- Solid understanding of extract, transform, load (ETL) methodologies used to pull data from multiple systems that are a mix of traditional relational database management systems and newer NoSQL big data systems, such as Cassandra and Hadoop.
- Experience architecting and developing distributed, highly scalable, fault-tolerant systems
- Ability to quickly come up to speed on and contribute to complex code with minimal supervision.
- Aptitude to independently learn new technologies.
- Must be hard-working, team-oriented, bright, creative, cooperative, and an exceptional problem solver
Description
Write code to manipulate and load big data from various sources into Hadoop.
Design and develop solutions using Hadoop to tackle big data, information retrieval, and analytics problems.
Design and build a fast analytics environment on large datasets/clusters.
Provide engineering and design support for Hadoop build, configuration, monitoring, and supportability.
Innovate practical NoSQL solutions to conquer scalability and distributed data processing challenges.
Design, develop, and support a MapReduce-based data processing pipeline to process reports on a growing number of NoSQL big data solutions used to support new applications and products.
Take responsibility for data accuracy, scalability, and integrity on the Hadoop platform.
Work with management to align solutions with business strategy and objectives.
Bring a strong sense of ownership, urgency, and drive, with a demonstrated ability to achieve goals in a highly innovative and fast-paced environment.
Play a key role in rethinking and building our future reporting and analytics infrastructure.
Education
BS, MS or PhD in Computer Science, Engineering, Math or Physics (or equivalent experience)
Additional Requirements
Passion for working with Big Data (unstructured and structured)
Interest in and passion for online media and the future of advanced analytics
Desire to build foundational technologies that define a new ecosystem
Experienced or familiar with Cassandra
Experienced or familiar with Voldemort
Experienced or familiar with Solr
Experienced with Oracle SQL and PL/SQL
Solid background in data warehousing principles, architecture, and implementation in large multi-terabyte environments
Experienced in merging data from NoSQL platforms and Oracle to achieve the desired outcome
Proficient with shell scripting and highly comfortable in Unix/Linux environments
Comfortable with high levels of responsibility in a fast-paced team environment
Excellent written and verbal communication skills