Julia Cherashore

Adjunct Professor

Fordham University

Julia Cherashore is currently an Adjunct Professor at Fordham University’s Gabelli School of Business, a Senior Fellow at the Data Foundation, and an Advisor to several AI start-ups. She is a 20+ year veteran of the financial services industry, having held progressively senior data roles at Morgan Stanley, Visa, and the New York State Department of Financial Services (NYS DFS). She was a first-in-role Chief Data Officer (Deputy Superintendent-Data Governance Management) at NYS DFS, where her work was recognized with a ‘Best of New York’ Award for ‘Best Workplace Initiative’ within NYS government. Previously, she spent 20 years at leading investment banks and financial services firms, including Morgan Stanley, Goldman Sachs, PwC, Visa, and Moody’s, across Data, Risk Management, Compliance, Management Consulting, and Operations. Julia is a frequent speaker at prestigious data & AI events, including Reuters Momentum AI, SuperReturn, CDO Vision, Corinium Intelligence, CDOIQ Symposium, GovTech, BFSI Nexus, and others. She received an MBA from NYU Stern School of Business and dual undergraduate degrees in Music and Business Administration from Weber State University.

04 June 2026 16:30 - 17:00

Panel | What breaks when GenAI scales: Latency, cost, and reliability in the real world

As GenAI adoption grows, the underlying infrastructure is under constant strain. Latency spikes, unpredictable traffic, rising inference costs, and brittle retrieval layers often emerge long after a system looks stable in testing. This session explores how engineering teams are redesigning serving layers, data pipelines, and performance workflows to keep GenAI systems fast, affordable, and reliable at scale.

Key takeaways:
→ How teams reduce latency under real-world load
→ The cost impact of routing, batching, and caching decisions
→ Where retrieval layers and vector search introduce scaling limits
→ Architectural choices that improve reliability as usage grows