The role sits within Decision Analytics, one of our four Global Business Lines.
Experian Decision Analytics helps clients achieve and sustain significant growth. We do this by enabling clients to make analytics-based customer decisions that support their strategic goals. As experts in uniting business understanding with consumer and business information, analytics and strategy execution, we empower clients to optimise customer value and actively manage it over time. This role therefore has clear accountability for creating measurable value within our client organisations.
What you’ll be doing
In this role, you’ll be working as an Analytics Engineer within the Analytics Center of Excellence on Experian’s internal Cloud Platform. You’ll manage data at scale and work on key initiatives such as Data Productization, where we use a range of big data technologies and back-end/front-end systems to build solutions for our clients.
More about you
- 2-5 years’ experience in data warehouse technologies, data modeling and ETL development
- Experience in a high-performing, large-scale technology driven environment
- Experience creating and maintaining data pipelines (Apache Spark or similar) and workflow orchestration tools (Airflow or similar) for both real-time and batch use cases; cloud experience (Azure preferred)
- Basic understanding of machine learning, deep learning, wrappers and APIs
- Experience with Big Data ML toolkits, such as Mahout, SparkML, or H2O (Mandatory)
- Strong coding skills in SQL and at least one of Python, Java or Scala
- Exposure to database architecture using RDBMS or NoSQL systems: views/tables, disk usage and relational diagrams
- Excellent communication and interpersonal skills, and proven project management skills
- Experience managing a Hadoop cluster, with all its included services, and ability to resolve ongoing operational issues with the cluster
- Experience building stream-processing systems using solutions such as Storm or Spark Streaming (good to have)
- Good knowledge of Big Data querying tools, such as Pig, Hive, and Impala
- Experience integrating data from multiple data sources
- Knowledge of various ETL techniques and frameworks, such as Flume
- Experience with various messaging systems, such as Kafka or RabbitMQ
- Good understanding of Lambda Architecture, along with its advantages and drawbacks
- Experience with Cloudera/MapR/Hortonworks