Conference Chair’s Opening Address
Professor Marc Salomon, Dean of the Amsterdam Business School (ABS), Director of the MBA in Big Data & Business Analytics and Member of the Big Data Alliance Board
Data governance strategies in the age of disruption
Anwar Mirza, Global Head of Data Governance, TNT
An outline of a Data Governance programme, framework and methodology embedded at the core of a company's operations.
- A formal approach to tangibly valuing the cost of poor quality data
- Exposing obvious business blind spots through the identification of 'must do' data governance controls
- How to prepare yourself for governing innovation, disruption and changing business strategies
Streams in the Digital Natives
Mic Hussey, Senior Systems Engineer, Confluent
It’s been said that “before you hire your first data scientist you should hire a data engineer”. What’s the big deal? Surely moving data around is easy?
Let’s talk about how modern digital natives deal with huge volumes of data and make it usable to build insight and operationalise their learnings.
Delivering maximum value from data science projects through productionising machine-learning models
Teren Teh, Vice-President, Market & Customer Insights, Barclays; CEO/Founder, Orchestra
Deploying your machine-learning models to production is a crucial step in transforming data science research into real value for your company.
This presentation will guide you on how to make the most of your data science projects and successfully deploy your models.
Ingredients for Successful Machine Learning Projects and Development of AI
Redouane Boumghar, Former mentor at NASA Frontier Development Lab; Member, Libre Space Foundation; former Data Scientist at European Space Agency
How can your organisation maximise the benefits of its machine learning projects and make them solid building blocks for Artificial Intelligence?
For all the hype around machine learning, it can often be difficult to find the right area and the right dynamics to develop and deploy it in your own organisation.
We share why involving your company with open-source communities can help to reduce costs and increase knowledge, and how it has become an essential part of the development and deployment of AI. We explore key ingredients you should consider to naturally generate business-case ideas and engage your whole company to unlock business value.
Improving Your Training Data for ML in Production
Kasper Knol, Data Scientist, Ford Motor Company
Having accurate results is of paramount importance when you are looking to deploy ML models in production, and the easiest way to improve accuracy is to improve your training data.
This presentation examines the practical steps you can take to create better ML results, looking at:
• Simple steps you can take to vet input data
• Using clusters and other visual tools to spot discrepancies
• Learning from feedback and testing to achieve high accuracy in production
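A minimal sketch of the first bullet, vetting input rows before they reach training, might look like the following. The field names, ranges and rejection rules are invented assumptions for illustration, not the speaker's actual pipeline:

```python
# Hypothetical sketch: simple vetting checks on tabular training data
# before it reaches an ML pipeline. Field names and thresholds are
# illustrative assumptions only.

def vet_rows(rows, required=("age", "income"), age_range=(0, 120)):
    """Split rows into clean rows and rejects, recording why each was rejected."""
    clean, rejects = [], []
    for row in rows:
        problems = [f for f in required if row.get(f) is None]
        age = row.get("age")
        if age is not None and not (age_range[0] <= age <= age_range[1]):
            problems.append("age_out_of_range")
        (rejects if problems else clean).append((row, problems))
    return [r for r, _ in clean], rejects

rows = [
    {"age": 34, "income": 52000},
    {"age": None, "income": 48000},   # missing value
    {"age": 140, "income": 61000},    # implausible age
]
clean, rejects = vet_rows(rows)
```

Even checks this simple catch the kind of silent data defects that otherwise surface only as degraded accuracy in production.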
The Panama Papers: Revolutionizing Investigative Journalism with Open Data
International Consortium of Investigative Journalism (ICIJ) case study presented by Jaap-Jan Pepping, Regional Business Director – Netherlands & Belgium, Talend
In May 2015, the International Consortium of Investigative Journalists (ICIJ) obtained from German newspaper Süddeutsche Zeitung an encrypted hard drive with leaked data from the Panamanian law firm Mossack Fonseca. The total size of documents received would end up being 2.6 TB and 11.5 million files. The ICIJ Data and Research Unit, with staff in four countries on two continents, started looking at how to process and analyze the data. The main challenges were dealing with dozens of data formats, putting them into a consistent and visual database, and then making all this data available to journalists worldwide. It immediately became clear that inside the leaked records were files of Mossack Fonseca’s client database, with information on who was secretly using the offshore world. The final challenge was to reverse engineer and reconstruct that database so that journalists – and ultimately the public – could use it.
• Reconstructing a database of 2.6 TB of data and 11.5 million documents
• History’s biggest data leak: A list of over 210,000 offshore companies across 21 jurisdictions
• 140 politicians from more than 50 countries connected to companies in tax havens
• 70 million page views from countries all around the world
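The consolidation challenge described above, folding dozens of source formats into one consistent schema, can be shown in miniature. The formats, field names and company names below are invented for illustration and have nothing to do with the actual leaked records:

```python
# Toy illustration: normalising records from heterogeneous source formats
# (here CSV and JSON) onto one target schema. All data is invented.
import csv
import io
import json

def normalise(record):
    """Map a raw record from any known source format onto one target schema."""
    return {
        "company": record.get("company") or record.get("name", "").strip(),
        "jurisdiction": (record.get("jurisdiction") or record.get("juris", "")).upper(),
    }

csv_text = "name,juris\nAcme Holdings,bvi\n"
json_text = '{"company": "Globex Ltd", "jurisdiction": "pan"}'

rows = [normalise(r) for r in csv.DictReader(io.StringIO(csv_text))]
rows.append(normalise(json.loads(json_text)))
```

The real effort in such projects is scaling this idea to dozens of formats and millions of documents while keeping the mapping rules auditable.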
Questions to the Panel of Speakers
Morning Networking and Refreshments Served in the Exhibition Area
What Makes a Good Data Visualisation?
Martijn Scheele, Head of Data & Analytics, Dutch Railways
Isabel Bevort, Data Scientist-Researcher, Dutch Railways
Data visualisations are still the best way of transforming the insights gleaned from tangled masses of data into stories that the rest of your organisation can understand and act on.
Data visualisers are often caught between two competing forces: creating graphs and visuals that are simple to read on the one hand, and making sure they are complex enough to reflect the realities of the underlying data on the other.
This presentation shows how you can overcome this by:
- Starting with the story you want to tell from the data, and building from there
- Balancing simplicity and complexity in data visualisation
- Incorporating feedback from other business areas into your designs
ING as a Global Event-Driven Bank
David Vaquero López, Global Lead Architect, ING Bank
ING is on a transformation journey towards becoming a global event-driven bank, driven by data triggered via globally scoped events. The goal is to make all of that data accessible and ready to be processed so that the right decision can be made at the right moment in time, whether that means actively reaching out to the customer or raising an internal alarm about possible fraud. In any case, ING wants to use the data it holds as a bank to help improve the financial lives of its customers in a frictionless way.
ING is using a Global EventBus, powered by Apache Kafka, that offers high throughput and low latency and enables replication of data between regions. Decisions are made using a global streaming data processing platform powered by Apache Flink, ideally suited for complex and demanding use cases in an international bank, such as customer notifications and fraud detection. These use cases require fast data processing and a business rules engine and/or model execution. Integrating these components in an always-on, distributed architecture can be challenging.
This presentation addresses how ING approaches the challenge, the role that Apache Kafka and Apache Flink play, the omni-channel communication component and the event-driven architecture that enables ING to maintain a scalable and decoupled landscape. We will also give a brief overview of the use cases, and you will learn why ING is becoming a truly global event-driven bank.
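As a toy illustration of the decoupling such an architecture provides, here is a minimal in-process publish/subscribe sketch. It stands in for Kafka topics and their independent consumers (e.g. fraud detection and customer notifications) only conceptually; all topic names, amounts and thresholds are invented for the example:

```python
# Minimal in-process event bus: a conceptual stand-in for Kafka topics
# with decoupled consumers. All names and thresholds are illustrative.

class EventBus:
    def __init__(self):
        self._subscribers = {}

    def subscribe(self, topic, handler):
        """Register a consumer for a topic; producers need not know it exists."""
        self._subscribers.setdefault(topic, []).append(handler)

    def publish(self, topic, event):
        """Deliver the event to every consumer of the topic."""
        for handler in self._subscribers.get(topic, []):
            handler(event)

bus = EventBus()
fraud_alerts, notifications = [], []

# Two independent consumers of the same payment stream:
bus.subscribe("payments",
              lambda e: fraud_alerts.append(e) if e["amount"] > 10_000 else None)
bus.subscribe("payments",
              lambda e: notifications.append(f"Payment of {e['amount']} received"))

bus.publish("payments", {"amount": 15_000})
bus.publish("payments", {"amount": 50})
```

The point of the pattern is that new consumers can be attached to the stream without touching producers, which is what makes the landscape scalable and decoupled.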
Questions to the Panel of Speakers
Delegate movement to the Seminar Rooms
Networking Lunch Served in the Exhibition Area
Conference Chair’s Afternoon Address
Becoming a Data Driven Organisation
Manuel de Francisco Vera, Director of Product Analytics, Elsevier
Creating a data-driven organisation requires more than just good data governance and analytics, it demands a fundamental change to the way your organisation acts and makes decisions. This presentation looks at the steps you can take to make this change by:
- Expanding operations and ensuring executive buy-in by demonstrating value
- Breaking through traditional processes to create a culture of evidence-based decision making
- Overcoming data silos to ensure analytic insights are shared and acted on at every level
How AI helps to entice customers to stay
Yuliya Sapega, Manager-Marketing Intelligence, AON
This presentation explores practical applications of Marketing Intelligence Modelling in the cloud to better understand and retain your customer base, as well as your growth potential.
• Using Predictive Modelling for churn events
• Classification vs Survival Analysis
• Setting up a Machine Learning process in the cloud
• Do’s and don’ts for optimal ROI
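To make the "classification vs survival analysis" distinction concrete: survival analysis treats still-active customers as censored observations rather than as non-churners. A minimal Kaplan-Meier retention sketch in plain Python, with purely illustrative data, might look like this:

```python
# Toy Kaplan-Meier retention curve for churn analysis.
# durations: months observed per customer; churned: True if the customer
# left, False if still active (censored). All data is illustrative.

def kaplan_meier(durations, churned):
    """Return (time, retention probability) at each churn event."""
    # At tied times, process churn events before censored observations.
    pairs = sorted(zip(durations, churned), key=lambda p: (p[0], not p[1]))
    at_risk, surv, curve = len(pairs), 1.0, []
    for t, event in pairs:
        if event:
            surv *= (at_risk - 1) / at_risk   # survival drops at each churn
            curve.append((t, surv))
        at_risk -= 1                          # leaves the risk set either way
    return curve

durations = [1, 2, 3, 4, 5]
churned = [True, False, True, False, False]
curve = kaplan_meier(durations, churned)
```

A plain classifier would have to label the three censored customers as "not churned", biasing the estimate; the survival approach simply removes them from the risk set.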
How can the Role of Chief Data Officer Drive Innovation and Change?
Henk Munter, Chief Data Officer, Beslist.nl
The Chief Data Officer is critical to ensuring that value is extracted from enterprise data, data quality processes are robust and that data and insights are accessible to the entire organisation.
But as the CDO evolves from a data steward into the key figure responsible for digital transformation, business culture and monetisation, we explore the steps all CDOs need to take in order to be a true force for innovation and change in the enterprise.
Questions to the Panel of Speakers
Afternoon Networking and Refreshments served in the Exhibition Area
Return on Investment (ROI) of Data Protection Compliance Engineering
Romeo Kadir, Vice President, European Association of Data Protection Professionals; President of the Board, European Institute for Privacy, Audit, Compliance & Certification
The financial benefits of compliance with data protection (GDPR) requirements not only outweigh the financial costs of non-compliance for data scientists and practitioners; if the right conditions are met, compliance can be an important business enabler for data science and data practices overall.
How is this done? It starts with the right mindset and a proper data protection engineering checklist of legal requirements and controls of which a generic version is presented and briefly discussed.
• Basic calculus of the ROI of data protection engineering
• How to engineer data protection into big-data-related disruptive technologies?
• How to engineer data protection into AI-related disruptive technologies?
• What does a practical (no-nonsense) Data Engineering Compliance Management Plan look like?
• What contribution could the Data Protection Officer (DPO) make in data protection engineering?
When Cryptography meets Big Data Analytics
Aisling Connolly, Cryptography and Privacy Researcher, Information Security, École Normale Supérieure
The introduction of the GDPR and high-profile data breaches have thrown privacy into the spotlight on the data stage. Traditional business models are being disrupted, and many methods for data analytics are no longer plausible. It is difficult to see how to move forward in this age of information, where privacy must remain a core principle.
At the same time, cryptography, once an innovation barrier, has seen major advances over the past decade. We are no longer confined to seeing ‘all or nothing’ when looking at data; instead, we can define how much, or how little, we want to see.
These cryptographic advances open up whole new avenues of exploration for data, and for privacy-preserving analytics. This talk will give a brief overview of such technologies.
• Cryptography is no longer a barrier to Innovation
• We can blindly go where no one has gone before
• Examples of Zero Knowledge Computing
• In terms of Big Data, what does this allow?
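As a flavour of what "zero knowledge" means in code, here is a toy Schnorr-style proof made non-interactive with the Fiat-Shamir heuristic: the prover convinces the verifier that it knows the secret x behind y = g^x mod p without revealing x. The parameters below are deliberately tiny and insecure; this is an illustration of the idea, not of the specific technologies in the talk:

```python
# Toy Schnorr zero-knowledge proof (Fiat-Shamir variant).
# Proves knowledge of x with y = g^x mod p without revealing x.
# Parameters are tiny demo values, NOT cryptographically secure.
import hashlib
import secrets

p, q, g = 23, 11, 2          # p = 2q + 1; g generates the order-q subgroup

def prove(x):
    r = secrets.randbelow(q)
    t = pow(g, r, p)                                   # commitment
    c = int(hashlib.sha256(str(t).encode()).hexdigest(), 16) % q  # challenge
    s = (r + c * x) % q                                # response (hides x)
    return t, s

def verify(y, t, s):
    c = int(hashlib.sha256(str(t).encode()).hexdigest(), 16) % q
    return pow(g, s, p) == (t * pow(y, c, p)) % p      # g^s == t * y^c ?

x = 7                 # the prover's secret
y = pow(g, x, p)      # the public value
t, s = prove(x)
```

The verifier learns that the prover knows x, and nothing else: the transcript (t, s) is statistically independent of x itself, which is the "see how little we want" property described above.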
Closing Keynote: Is it Time for a Data Code of Ethics?
Marc Steen, Senior Research Scientist: Human-Centred Design; Responsible Innovation, TNO
As the world is rocked by the revelations of bulk data collection, behavioural tracking, and alleged manipulation of elections, this closing keynote asks what this means for people working in this industry.
How can data scientists contribute to responsible innovation?
We will consider several options, for example, a code of conduct; a set of guidelines; and compliance with legislation. How can one align one’s feelings and thoughts, and act responsibly? What would Aristotle do?
Questions to the Panel of Speakers
Conference Chair's Closing Address
Conference Closes and Delegates Depart
Whitehall Media reserve the right to change the programme without prior notice.