Employment Type: Full-time, Permanent
About Kinisi Robotics
We build intelligent humanoid robots for real-world manipulation tasks. Our KR1 platform is a mobile bi-manual humanoid with dual 7-DOF arms, multiple gripper types, force/torque sensing, and high-performance onboard compute. Unlike many robotics companies, we design our robots to perform useful work in real customer environments. We are building a complete robot intelligence stack that combines state-of-the-art robot learning with mature robotics software, perception, and control systems.
Our robotics software teams provide the perception, tracking, state estimation, safety, planning, and Cartesian control infrastructure that enables learned policies to operate reliably on real hardware. In this architecture, Vision Language Action (VLA) models and large behaviour models act as the robot’s decision-making layer, while the underlying robotics stack provides safe and robust execution. The robot learning systems developed in this role will directly influence how future generations of humanoid robots acquire new skills. You will work at the intersection of modern AI and physical robotics, helping transform state-of-the-art robot learning techniques into capabilities that perform useful work in real customer environments.
About the Role
We are looking for a Senior Robot Learning Engineer to develop, train, and deploy large behaviour models and Vision Language Action (VLA) systems for advanced bi-manual mobile manipulation. This is an applied robotics role focused on real-world deployment. Success is measured by robot capability, robustness, and customer value rather than publication count.
You will join a team where much of the hard infrastructure work has already been completed. We have production robots, a mature robot-learning pipeline, cloud training infrastructure, teleoperation-based data collection, onboard deployment tooling, and dedicated robotics, perception, and infrastructure teams. Your mission is to improve the intelligence running on our robots: scaling our most promising policy architectures, developing multi-task manipulation capabilities, improving generalisation, and deploying learned behaviours that perform useful work in production environments.
You are not joining to build the infrastructure around robot learning; you are joining to improve the intelligence running on robots that already exist.
Existing Infrastructure
Kinisi has already built the core infrastructure required for large-scale robot learning. You will join a team with:
- Production humanoid robots operating in real-world environments.
- A VR-based teleoperation system for rapid demonstration collection.
- Large and growing manipulation datasets collected on physical robots.
- A mature cloud-based training pipeline.
- Automated workflows that train models in the cloud and deploy them directly onto physical robots.
- NVIDIA Jetson Thor compute onboard every robot for high-performance inference.
- Dedicated robotics software, perception, and infrastructure teams.
- Significant cloud GPU resources for large-scale training and experimentation.
This role is focused on improving robot intelligence rather than building the surrounding infrastructure from scratch.
What Success Looks Like
After 12 months you will have:
- Deployed learned manipulation policies onto production KR1 robots.
- Increased the number of customer tasks that can be automated through robot learning.
- Improved manipulation success rates, robustness, and generalisation across real-world environments.
- Established repeatable workflows for developing and deploying new robot capabilities.
- Delivered measurable improvements in robot performance that create customer value.
Our Approach to Robot Intelligence
Kinisi is not currently developing frontier-scale foundation models from scratch. Instead, we focus on building practical robot intelligence systems that can be trained rapidly, deployed efficiently, and continuously improved using data collected from real-world operation.
Our approach combines:
- Vision Language Action models.
- Large behaviour models.
- Specialised manipulation policies.
- Agentic decision-making systems that select between capabilities.
- Classical robotics perception and control systems.
This allows us to iterate quickly, train new capabilities in days rather than months, and continuously deploy improvements onto production robots. We believe the fastest path to capable humanoid robots is a practical, deployment-focused approach rather than the pursuit of ever-larger models.
Technical Environment
- Policy architectures: Modern, state-of-the-art architectures, including those based on attention mechanisms, diffusion models, and large vision-language models (VLMs) adapted for robot control. Focus on multi-task, language-conditioned policies and policy fine-tuning.
- Training: Standard ML toolchain using a leading deep learning framework, configured for cloud-based GPU orchestration, experiment tracking, and configuration management.
- Data: Extensive datasets covering various robot learning paradigms, including multi-modal sensor streams (visual, depth, force, proprioceptive) collected via high-frequency teleoperation and facilitated by robust annotation and processing pipelines.
- Robot platform: Advanced, general-purpose humanoid robot platform featuring high-DOF manipulation capabilities, diverse end-effectors, precise Cartesian control, integrated force sensing, and onboard compute.
- Perception: A mix of established and cutting-edge vision models (e.g., foundation models for perception, advanced instance/semantic segmentation, and state estimation), coupled with a high-speed, GPU-accelerated pipeline for 3D environment mapping and object detection.
- Inference: Highly optimised, on-robot inference leveraging framework-native compilation and advanced GPU techniques (e.g., CUDA graphs) to achieve time-optimal action execution within a standard robot control environment interface.
- Simulation: Diverse simulation environments ranging from high-fidelity physics simulators to high-realism rendering engines and standardised benchmarks, used for rapid iteration and integration testing.
- Observation space: Rich multi-modal observation space encompassing full robot proprioception, multi-view visual inputs, end-effector state representations, and processed environment features.
- Action space: High-level, coordinated bi-manual control in a Cartesian space (position/orientation and gripper commands).
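As a rough illustration of the action space described above, a bi-manual Cartesian action can be thought of as one pose and one gripper command per arm. The sketch below is an assumption for illustration only and not Kinisi's actual interface: the class names, the quaternion ordering (x, y, z, w), the gripper range, and the 16-value flat layout are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class ArmCommand:
    """Hypothetical per-arm command: Cartesian pose plus gripper target."""
    position: Tuple[float, float, float]            # metres, in an assumed base frame
    orientation: Tuple[float, float, float, float]  # unit quaternion, assumed (x, y, z, w)
    gripper: float                                  # assumed 0.0 = open, 1.0 = closed

    def to_vector(self) -> List[float]:
        # 3 position + 4 orientation + 1 gripper = 8 values
        return [*self.position, *self.orientation, self.gripper]


@dataclass(frozen=True)
class BimanualAction:
    """Hypothetical coordinated action for both arms."""
    left: ArmCommand
    right: ArmCommand

    def to_vector(self) -> List[float]:
        # Flat 16-value layout: left arm first, then right arm
        return self.left.to_vector() + self.right.to_vector()

    @classmethod
    def from_vector(cls, v: Sequence[float]) -> "BimanualAction":
        if len(v) != 16:
            raise ValueError(f"expected 16 values, got {len(v)}")

        def arm(chunk: Sequence[float]) -> ArmCommand:
            return ArmCommand(tuple(chunk[0:3]), tuple(chunk[3:7]), chunk[7])

        return cls(left=arm(v[0:8]), right=arm(v[8:16]))
```

A policy that outputs a flat vector could be decoded with `BimanualAction.from_vector(...)` before the command is handed to a Cartesian controller. Keeping the structured and flat forms interchangeable is one common way to let learning code and control code share a single action definition.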
Core Responsibilities
Policy Architecture & Training
- Architect, train, and evaluate end-to-end large behaviour models for bi-manual and mobile manipulation tasks.
- Scale and evolve our most promising architectures: advance diffusion transformer policies, mature VLA integration, and develop language conditioning for true multi-task generalisation.
- Extend the imitation learning pipeline to leverage growing demonstration datasets collected via VR teleoperation.
- Apply reinforcement learning to refine pre-trained policies beyond what imitation alone can achieve, using approaches such as RL token fine-tuning, residual RL, off-policy RL with reference-action regularisation, and RL-based fine-tuning of diffusion policies.
- Develop robust manipulation policies for contact-rich, real-world tasks.
Generalisation & Scaling
- Develop policies that generalise across diverse manipulation tasks, object categories, and environments. Move from single-task models toward multi-task and task-conditioned architectures.
- Design hierarchical behaviour systems for complex, long-horizon manipulation sequences, complementing the existing behaviour orchestration layer.
- Investigate data-efficient learning: few-shot adaptation, transfer learning across tasks, and systematic use of multi-dataset training.
- Drive systematic ablation studies across policy architectures to identify which approaches best suit different manipulation skill classes.
Sim-to-Real & Deployment
- Build a systematic sim-to-real transfer pipeline. Simulation infrastructure exists but is not yet connected to training. Develop domain randomisation, rendering augmentation, and sim-to-real benchmarking workflows.
- Deploy and iterate learned policies on physical robot hardware.
- Collaborate with robotics software engineers to extend the Gymnasium environment wrapper (observation/action spaces, reward signals) and improve integration with the robot’s control stack.
- Work with the perception team to leverage visual representations (keypoints, learned features, 3D point clouds) for policy conditioning.
Required Qualifications
- MSc, PhD, or equivalent industry experience in Robotics, Machine Learning, Computer Science, or a related field.
- Demonstrated experience training and deploying learned manipulation policies on physical robot hardware.
- Strong background in at least two of:
  - Behaviour cloning
  - Diffusion policies
  - Vision Language Action models
  - Reinforcement learning for manipulation
- Experience with PyTorch and modern deep learning workflows.
- Strong Python software engineering skills.
- Ability to design experiments and evaluate real-world robot performance.
Preferred Qualifications
- Hands-on experience with humanoid or bi-manual manipulation platforms.
- Experience with diffusion transformer, ACT, or VLA architectures specifically.
- Background in leveraging pre-trained vision or language models (CLIP, DINOv2, PaliGemma) for robotic control.
- Experience with MuJoCo, Isaac Sim, or ManiSkill for sim-to-real policy training.
- Experience with RL fine-tuning of pre-trained policies (residual RL, DPPO, or similar).
- Knowledge of 3D perception for policy conditioning (point clouds, keypoints, NeRFs).
Who You’ll Work With
- Applied Research Scientists (ML + Perception): developing perception systems, reinforcement learning techniques, and robot learning architectures that improve robot capability.
- Senior AI Platform Engineer: building the platform that powers dataset management, training, evaluation, deployment, and fleet rollout.
- Machine Learning Engineers: hardening validated architectures into reliable production systems and optimising deployment on robot hardware.
- Robotics Software Engineers: providing the perception, planning, safety, and Cartesian control systems that enable learned policies to operate reliably on physical robots.
- Real Robots: access to a growing fleet of KR1 humanoid robots operating in real-world environments, with multiple robots available in-house for rapid experimentation, testing, and deployment.
- End-to-End Ownership: because Kinisi develops its own robot hardware, end-effectors, teleoperation systems, and data collection infrastructure, you can rapidly prototype new ideas and see them deployed on real robots.
Why Join Kinisi?
Most robot-learning roles require spending years building infrastructure before meaningful deployment is possible.
At Kinisi, the robots exist. The training pipeline exists. The deployment infrastructure exists. The data collection pipeline exists.
Your focus will be making robots smarter and deploying those improvements onto real humanoid robots performing useful work in the real world.
We offer competitive salary and equity, comprehensive health cover, conference opportunities, excellent office space, and a highly collaborative engineering culture.
To apply for this role, please include a CV, your LinkedIn profile, and the job you are applying for.