Karan Singh

Research Software Engineer

Hello, I'm an engineer dedicated to building practical, reliable systems for real-world use, and one who has learned that there is a world of difference between understanding something by reading about it and understanding it by doing it.

About Me

Mission & Motivation

I like building fast, efficient, and reliable software systems that solve real-world problems. I hold a bachelor's degree in computer science, and my interests span scientific computing, AI, and systems engineering. I'm currently a Research Software Engineer at IFIC Valencia, working on projects in collaboration with LHCb and ATLAS at the Large Hadron Collider (the world's largest particle accelerator, at CERN). Working in this environment has shown me what truly high-demand data systems look like, and it continues to shape how I think about performance and engineering.

Technical Experience

Before this, I gained experience across different domains: as an AI intern at Nihin Media K.K., where I built and deployed machine learning tools, and through Google Summer of Code with CERN-HSF, where I helped improve scientific computing workflows that demand reliability and careful problem-solving. Over the past few years, I've focused mainly on backend development, data science, and AI, building systems in C++ and Python that range from storage components to practical ML tools. I also studied data science and AI through programs from IIT Delhi, IIT Madras, NVIDIA, IBM, and Meta, combining structured learning with hands-on work so that I can approach problems with clarity and practicality.

Community & Leadership

I've also been active in my college's tech community. I served as the AI/ML domain lead for the Center for Innovation, Incubation and Entrepreneurship, where I mentored students, organized workshops, taught AI/ML fundamentals, and helped run hackathons to build a practical and collaborative learning environment.

Philosophy & Approach

I've always been a curious person, but I also believe it's equally important to stay practical and reliable in what I build. I often take the harder route and create things from scratch because it helps me understand them properly. I enjoy collaborating with people from different domains and believe that clear communication is key to solving complex problems effectively.

Featured Projects

Energy Profiling & Optimization at CERN-LHCb

I worked on analyzing the energy use of track-reconstruction software inside the ACTS C++ framework. I developed an energy-profiling module and used it to study the behavior of key fitting algorithms such as the Combinatorial Kalman Filter (CKF), Gaussian Sum Filter (GSF), and Runge–Kutta–based track propagation. This helped identify inefficiencies and provided clear insights to improve performance for LHCb and ATLAS workflows.

C++, High-Performance Computing, Data Analysis, Profiling
View on GitHub →

Google Summer of Code - Quantum Algorithm Benchmarking

I developed a Python framework that implements classical ML algorithms and simulates quantum ML algorithms on classical hardware. The goal was to estimate the energy use and performance researchers can expect when running the same processes on real quantum hardware, giving them a practical baseline before using actual quantum systems.

Python, PyTorch, PennyLane, Quantum Computing, Framework Design
View on GitHub →

High-Performance Time Series Database Engine

Seeing the limits of general-purpose databases for time-series workloads, I set out to build my own storage engine. I designed a C++17 system with custom compression and a log-structured, time-sharded architecture. Benchmarked on an Intel i7-12700H with 16 GB DDR4 RAM and an NVMe Gen4 SSD, it achieved ~50% better storage efficiency and sub-millisecond query times on real datasets.

C++, Databases, System Design, Performance, Docker
View on GitHub →

Distributed Fault-Tolerant Cache System

To understand how distributed systems achieve reliability at scale, I built a fault-tolerant key-value store from scratch. The system uses consistent hashing and N-way replication, and I validated it with automated chaos testing on an Intel i7-12700H with 16 GB RAM and an NVMe Gen4 SSD, where it sustained 17k GET ops/sec with zero data loss, even during simulated node failures.

Python, Distributed Systems, gRPC, Asyncio, Fault Tolerance
View on GitHub →
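
As a rough illustration of the consistent hashing and N-way replication used in the cache project above, here is a minimal Python sketch. It is not the project's actual code: the class name `HashRing`, the `vnodes` and `replicas` parameters, the node names, and the MD5-based hash are all assumptions chosen for illustration.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # Illustrative choice: the first 8 bytes of MD5 give a stable 64-bit ring position.
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes and N-way replication."""

    def __init__(self, nodes, vnodes=64, replicas=3):
        self.vnodes = vnodes      # virtual nodes placed on the ring per physical node
        self.replicas = replicas  # number of distinct nodes that store each key
        self._ring = []           # sorted list of (position, node_name)
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        # Each physical node appears at many ring positions to even out the load.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove_node(self, node: str) -> None:
        # Simulates a node failure: drop all of its virtual nodes.
        self._ring = [(pos, name) for pos, name in self._ring if name != node]

    def nodes_for(self, key: str) -> list:
        # Walk clockwise from the key's position, collecting distinct nodes
        # until enough replicas are found (or the ring is exhausted).
        if not self._ring:
            return []
        start = bisect.bisect_right(self._ring, (_hash(key), ""))
        owners = []
        for offset in range(len(self._ring)):
            _, node = self._ring[(start + offset) % len(self._ring)]
            if node not in owners:
                owners.append(node)
                if len(owners) == self.replicas:
                    break
        return owners


if __name__ == "__main__":
    ring = HashRing(["node-a", "node-b", "node-c", "node-d"], replicas=3)
    before = ring.nodes_for("user:42")
    ring.remove_node(before[0])  # simulate losing the primary
    after = ring.nodes_for("user:42")
    print("before failure:", before)
    print("after failure: ", after)
```

In this sketch, losing a key's primary node leaves its remaining replicas in place, and the next node clockwise on the ring joins the replica set. That property is what lets reads keep succeeding during a simulated node failure.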

Technical Skills

Programming Languages

C++ Python SQL

Backend & Databases

Docker Kubernetes Kafka Redis PostgreSQL MySQL FastAPI Elasticsearch

Cloud & DevOps

Linux AWS (EC2, S3, Lambda) Azure CI/CD GitHub Actions

AI & Machine Learning

PyTorch Scikit-learn Pandas NumPy Transformers LangChain CUDA

Publications & Leadership

Technical Research

  • Sustainability studies of big data processing...
    Co-authored poster at ACAT 2025.
  • Improvements on QAOA for Particle Trajectories...
    Co-authored poster at ACAT 2025.

Leadership

  • AI/ML domain lead, mentoring 100+ students and organizing 15+ workshops.
  • Secured 3rd place at SRM Builds 5.0 & 6th place at Intel's AI Hackathon.