New York

917-310-0088 iPhone

selector@pipeline.com | www.selectorweb.com | www.LevSelector.com




  Ph.D., Experienced Data Scientist, Machine Learning & AI Researcher, Software Engineer

  More than 15 years of hands-on engineering and architecture experience working with data

  Ph.D. in Mathematical Modeling and Computer Simulations

  Extensive experience with high-volume data processing and calculations and big data feeds: cloud (AWS EC2, Google BigQuery, IBM Watson), high-volume web scraping, Hadoop, databases (SQL-based and key-value stores such as Redis), high-volume ETL, real-time algorithmic trading, web applications, and business intelligence & analytics

  Proven track record of finding and implementing pragmatic solutions to complicated problems, designing new software architectures, and delivering them on time

  Experience with both Fortune 500 and small companies in the Finance, AdTech, e-Commerce, Publishing, and Marketing industries




  Machine Learning & Artificial Intelligence, NLP (Natural Language Processing), Data Science

  Mathematics and Finance: Ph.D. (Math. Modeling), Advanced Calculus, Probability and Statistics, Time Series Analysis, Numerical Methods, Machine Learning, Deep Learning

  Programming: Python, Pandas, NumPy, SQL, ETL, algorithms, Perl, JavaScript, C/C++, Excel VBA, Hadoop, Java, Web Applications, CGI, XML, JSON, Unix & MS Windows, Cloud (Amazon AWS, Google, IBM)

  Project Management, Software Architecture




June 2017 – present – Selectorweb, consulting projects

Machine Learning, AI, Analytics, NLP (Natural Language Processing). Anaconda Python3, Scikit-Learn, NLTK, Pandas, Numpy, TensorFlow, Amazon AWS (EC2, S3), Google Cloud (BigQuery, MySQL), IBM Cloud (IBM Watson – Natural Language Understanding for Sentiment Analysis, Tone Analyzer, Personality Insights)

  Galvanize ( https://www.galvanize.com/new-york ) - delivered presentations to graduates on Deep Learning and AI

  Leadvisor ( https://www.leadvisor.io ) - built a system to automatically generate hierarchical reports (Excel and web)

  JKCF ( http://www.jkcf.org ) - Machine learning using logistic regression and NLP (Natural Language Processing) on highly imbalanced data. Data preparation, feature extraction/construction. Compared different approaches to the imbalanced-data problem and improved the accuracy of the model
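One standard remedy for imbalanced classes is to weight the minority class more heavily during training. The sketch below is purely illustrative (not the project's code, and the synthetic data and weighting scheme are invented): a NumPy-only weighted logistic regression fitted by gradient descent.

```python
# Minimal sketch of class-weighted logistic regression for imbalanced data.
# Illustrative only: data, weights, and hyperparameters are invented.
import numpy as np

def fit_weighted_logreg(X, y, lr=0.1, epochs=500):
    """Gradient-descent logistic regression with per-class weights
    inversely proportional to class frequency."""
    n, d = X.shape
    w_pos = n / (2.0 * max(y.sum(), 1))        # minority class gets more pull
    w_neg = n / (2.0 * max((1 - y).sum(), 1))
    sample_w = np.where(y == 1, w_pos, w_neg)
    beta = np.zeros(d)
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ beta))    # predicted probabilities
        grad = X.T @ (sample_w * (p - y)) / n  # weighted log-loss gradient
        beta -= lr * grad
    return beta

# tiny synthetic imbalanced set: 90 negatives near 0, 10 positives near 2
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (90, 1)), rng.normal(2, 1, (10, 1))])
X = np.hstack([np.ones((100, 1)), X])          # intercept column
y = np.array([0] * 90 + [1] * 10)
beta = fit_weighted_logreg(X, y)
preds = (1.0 / (1.0 + np.exp(-X @ beta)) > 0.5).astype(int)
recall = preds[y == 1].mean()                  # minority-class recall
```

Without the weighting, a model on 90/10 data can reach high accuracy by always predicting the majority class; the weights push the decision boundary back toward the midpoint of the two classes.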


November 2014 – June 2017 - Penguin Random House, Consultant

Machine Learning and AI, Business Intelligence/Analytics, Big Data, Data Integration (ETL/ELT), Enterprise Reporting

Python (pandas, numpy, numba, cython), IBM/Netezza, SQL, Redis, Amazon Cloud (EC2 Linux)

  Taught the Python language and analytics tools

  Organized "Deep Learning Book Club" to promote the use of Machine Learning and AI in-house

  Created a Python framework and tools to work with the IBM Netezza database

  Worked with several business groups to design workflows for multiple data feeds

  Created various data feeds and tools to clean and ingest millions of rows of data daily

  Configured AWS EC2 instances and wrote software for high-volume web scraping

  Wrote analytics tools for price analysis and estimation


April 2012 - April 2014 – AppNexus, Inc, Consultant, Financial Data Analytics

Responsible for designing and implementing systems for data extraction and aggregation, data analytics, billing, and reporting in a fast-growing ad-tech company (AppNexus generates ~60 TB of data per day from its RTB platform)

Technologies: Linux, Python, Pandas, Vertica, Mysql, Hadoop, Hive, git

  Designed and implemented a new billing and reporting framework that extracts, aggregates, and processes hourly trading data. Designing the new architecture required working with multiple teams (Data Team, Finance, Product, and Sales Operations)

  Redesigned the monthly data extraction process (more than 30 billion rows), cutting extraction time from 3 hours to 7 minutes

  Implemented the framework using the Python Pandas libraries (more than 43 thousand lines of compact, structured Python/Pandas code). Data comes from multiple sources (Vertica & MySQL databases, Excel and CSV files). The framework performs data filtering, aggregation, sorting, adjustment, and processing using different custom rules for different clients. Data is output into the database, and also into files for upload into the ERP system. The code is designed to run multiple automatic tests (including back-testing) to validate both the data and the code. Since its creation, the code has been responsible for more than 2 billion dollars in billing (invoices and payments)

  Designed the cost & revenue reporting database. Designed intelligent self-recoverable ETL processes and health-monitoring processes, allowing automatic self-recovery after outages

  Designed business intelligence & analytics systems (multiple reports, graphs)

  Was responsible for daily billing and reporting operations: monitoring data loaders, collecting and validating data and business requirements, running calculations, and validating the numbers before importing them into the ERP system. Resolved problems and answered customers' requests

  Trained the team to use and maintain the billing/reporting systems and the cost_revenue database. Documented the code and processes. Designed and taught courses on Python/Pandas and on billing procedures for the company's employees
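The filter-then-aggregate-with-per-client-rules pattern described above can be sketched in a few lines of pandas. This is an illustrative toy, not the actual framework: the column names, the sample rows, and the discount rule are all invented.

```python
# Toy sketch of per-client aggregation with custom billing rules.
# Columns, data, and the discount rule are invented for illustration.
import pandas as pd

hourly = pd.DataFrame({
    "client":  ["a", "a", "b", "b"],
    "imps":    [100, 200, 50, 150],
    "revenue": [1.0, 2.0, 0.5, 1.5],
})

# hypothetical custom rules: client "b" gets a 10% volume discount
rules = {"a": 1.00, "b": 0.90}

# roll hourly rows up to one row per client
monthly = (hourly
           .groupby("client", as_index=False)
           .agg(imps=("imps", "sum"), revenue=("revenue", "sum")))

# apply each client's rule to the aggregated revenue
monthly["billed"] = monthly.apply(
    lambda r: r["revenue"] * rules[r["client"]], axis=1)
```

In a real framework the `rules` lookup would be a table of client contracts, and the output would be validated (e.g. back-tested against the previous run) before export to the ERP system.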


April 2010 – April 2012 – JPMorgan Chase & Co., Investment Banking, Consultant

Mortgage Analytics (Structured Products Group). Technologies: Unix/Linux, C/C++, Perl, Python, Sybase, Excel VBA

  Played a key role in migrating legacy C++ calculators (Fixed Income MBS) across data centers. Personally fixed and migrated ~100 applications, including old C/C++ applications, batch jobs, scripts used by modeling groups, and web applications. Wrote multiple Perl modules and applications (data feeds, utilities, monitoring, docs-builder system) to help the migration and to streamline the nightly batch processes

  Instrumented analytical calculators to measure CPU time, instrument IDs and categories, etc. These metrics were later used to generate reports identifying how resources were allocated

  Developed Excel VBA apps for regression analysis, and as a frontend for analytical calculators implemented in C++


2009 - WorldQuant, LLC. Consultant - Trading support for algorithmic trading systems. Created new strategies and analyzed their performance. Created monitoring tools using parameters such as position similarity, fill rate, imbalance, PnL, drawdown, Sharpe ratio, slippage measure, etc. Resized strategies as needed. Resolved problems, generated custom reports. Ran simulations to back-test strategies. Unix, Perl, C++, MySQL, Excel VBA
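Two of the monitoring metrics mentioned above have simple standard definitions. The sketch below shows the generic formulas (annualized Sharpe ratio and maximum drawdown on a cumulative PnL curve); the sample PnL series is invented, and this is not the actual monitoring code.

```python
# Generic formulas for two monitoring metrics: Sharpe ratio and max drawdown.
# The sample PnL series is invented for illustration.
import math

def sharpe(daily_pnl, periods_per_year=252):
    """Annualized Sharpe ratio: mean / stdev, scaled by sqrt(periods)."""
    n = len(daily_pnl)
    mean = sum(daily_pnl) / n
    var = sum((x - mean) ** 2 for x in daily_pnl) / (n - 1)
    return mean / math.sqrt(var) * math.sqrt(periods_per_year)

def max_drawdown(cum_pnl):
    """Largest peak-to-trough drop of the cumulative PnL curve."""
    peak, dd = float("-inf"), 0.0
    for v in cum_pnl:
        peak = max(peak, v)
        dd = max(dd, peak - v)
    return dd

pnl = [1.0, -0.5, 2.0, -1.0, 0.5]
cum = [sum(pnl[:i + 1]) for i in range(len(pnl))]  # 1.0, 0.5, 2.5, 1.5, 2.0
dd = max_drawdown(cum)                             # peak 2.5 -> trough 1.5
sr = sharpe(pnl)
```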

2009 - Citigroup. Equity Financing. Consultant - Migration of Perl/Java/SQL data and analytics jobs from Sybase to MS SQL Server. Data cleaning and validation. Unix, Perl, C++, Java, Sybase, MS SQL Server

2007-2008 - HSBC. Asset Management Group. Consultant - Trading support: trade feeds and reports. PnL and risk reports. Analytics. Perl, Sybase 12.5, Oracle, Windows Server 2003, Control-M

2006-2007 - Merrill Lynch. Consultant - EFS project (Enterprise File System). Responsible for EFS operations. Unix, Perl, C/C++, Oracle, Sybase, NAS filers, NFS

2005-2006 - JPMorgan Chase. Consultant - Joint Distribution System (JDS). Database architecture, ETL, database performance tuning, automation of testing/deployment. Unix, Perl, Java, Sybase

2004-2005 - CSFB, Prime Services. Programmer - Data Warehouse development and support. Products, pricing, risk, PnL. Perl, shell scripts, Sybase, CGI, Java servlets / Weblogic server

2000-2003 - Goldman Sachs. FICC (Fixed Income Currencies and Commodities). Consultant - Development and maintenance of the OE (Organizational Entities) system, a web-based application with a database of all FICC clients, their accounts, business interests, people, and contact information. Unix, Perl, C++, Java, Jython, Sybase, DB2, Informatica

2000 - Morgan Stanley. Consultant - Web application for portfolio management. Technologies: Unix, Perl, Java, Sybase

1999-2000  Cantor Fitzgerald / Espeed. Programmer - Distributed Trading System (DTS). Added a history system and a deployment system. Created a documentation website. Perl, mod_perl / CGI, Java, Sybase

1998-1999  Waterhouse Securities. Programmer – Developed brokerage website. Unix, Perl/cgi, web-design, Oracle

1994-1998  Infolink International, Inc. Project Lead - Web design for Business. Unix, Perl, HTML, Adobe Photoshop

1991-1994  Columbia University. Staff Associate - Mathematical modeling of the dynamics of organic molecules. Programs were written in C and distributed to several Unix computers in different labs; operations were controlled from one desktop Macintosh, results were fed from Unix back to the Mac, and custom programs (Mac C++ and IGOR software) were used to process the results automatically

1981-1991 National Cardiology Research Center, Moscow, Russia. Researcher

  Real-time data acquisition and computer processing in neuro-physiological experiments. Semi-automatic pattern recognition and categorization of data. Computer simulations of nerve impulse generation and propagation along C-fibers. Partial differential equations, Hodgkin-Huxley model, Crank-Nicolson & modified Runge-Kutta methods

  Hardware and software design of medical equipment
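The Crank-Nicolson scheme mentioned above averages the explicit and implicit finite-difference steps, which makes it unconditionally stable and second-order accurate in time. A generic sketch, applied here to the simple 1-D diffusion equation (as in a passive cable) rather than the full Hodgkin-Huxley system, and not the original research code:

```python
# Generic Crank-Nicolson step for the 1-D diffusion equation u_t = D * u_xx
# with clamped (zero) ends. Illustrative sketch, not the original code.
import numpy as np

def crank_nicolson(u0, D, dx, dt, steps):
    n = len(u0)
    r = D * dt / (2 * dx * dx)
    # discrete Laplacian L; scheme: (I - r*L) u_new = (I + r*L) u_old
    L = (np.diag(np.full(n - 1, 1.0), -1)
         - 2 * np.eye(n)
         + np.diag(np.full(n - 1, 1.0), 1))
    A = np.eye(n) - r * L
    B = np.eye(n) + r * L
    u = u0.copy()
    for _ in range(steps):
        u = np.linalg.solve(A, B @ u)
        u[0] = u[-1] = 0.0          # clamp both ends of the fiber
    return u

x = np.linspace(0, 1, 51)
u0 = np.exp(-((x - 0.5) ** 2) / 0.01)   # initial bump mid-fiber
u = crank_nicolson(u0, D=1.0, dx=x[1] - x[0], dt=1e-4, steps=200)
```

Diffusion lowers and widens the initial bump; the averaging of old and new time levels is what distinguishes Crank-Nicolson from the purely explicit or purely implicit schemes.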



  1988 Ph.D. in Biophysics (computer simulation of nervous coding), Graduate School of Moscow Physics and Technology Institute

  1981 MS in Electronics and Automation from Moscow Physics and Technology Institute ( www.mipt.ru ). Majored in computers, electronics, and biophysics; diploma thesis on computer simulation of nerve activity



  Coursera Machine Learning & Deep Learning courses

  Data Analysis with Python and Pandas

  SEC Registered Representative ( Series 7, Series 63 )

  CQF ( Certificate in Quantitative Finance 6-month course )

  Advanced Object Oriented Perl (by Damian Conway)

  C++ for Quantitative Finance

  Advanced Excel for Financial Applications

  Java 2 (Sun)