Be a pioneer, not just an employee.
We empower you with ownership and autonomy to solve challenging problems and drive tangible change, setting the pace for the industry.
Current Openings:
Site Reliability Engineering Architect:
Are you a visionary SRE leader ready to define the next generation of highly available, scalable, and resilient systems? We're looking for an SRE Architect to bridge the gap between development and operations, setting the strategic direction and technical standards for reliability across our entire product suite.
This is a chance to lead the implementation of AIOps and Chaos Engineering practices, fundamentally transforming how we build and run mission-critical, cloud-native services. You will be the ultimate champion for the SRE philosophy, ensuring operational excellence is baked into our architecture from day one.
Education & Experience: B.Tech, BE or MCA, 10 - 20 Yrs
Roles and Responsibilities: Drive Operational Excellence
The SRE Architect will be responsible for incorporating the following modern technologies and practices:
Strategic Architecture & Automation (IaC/GitOps)
Cloud-Native Design: Lead the architectural design of Kubernetes-based microservices, ensuring optimal configuration for horizontal scaling, fault-tolerance, and cost efficiency across multi-cloud environments (e.g., AWS, GCP, Azure).
Infrastructure as Code (IaC): Standardize and govern infrastructure provisioning using modern Terraform and Pulumi modules, promoting immutable infrastructure and self-service capabilities.
GitOps Implementation: Drive the adoption of GitOps principles using tools like ArgoCD or Flux, ensuring all environment configurations and deployments are declarative, version-controlled, and auditable.
Advanced Observability & AIOps
Full-Stack Observability: Architect and deploy an enterprise-grade observability platform utilizing the "three pillars" (Metrics, Logs, Traces) with tools like Prometheus, Grafana, Jaeger, and OpenTelemetry.
AI/ML for Incident Management (AIOps): Spearhead the integration of Machine Learning models to transform incident response by:
Implementing predictive analytics to foresee system degradation before it impacts users.
Automating alert correlation and deduplication to combat alert fatigue.
Designing AI-driven self-healing mechanisms and automated runbooks for low-risk scenarios.
SLO Governance: Define, measure, and enforce strict Service Level Objectives (SLOs) and Error Budgets to balance reliability against feature velocity, directly tying engineering decisions to business outcomes.
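To illustrate the error-budget arithmetic behind SLO governance, here is a minimal sketch. The 99.9% target, the 30-day window, and the function names are assumptions chosen for the example, not our published targets or tooling.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Return the downtime (in minutes) an availability SLO allows over a window."""
    total_minutes = window_days * 24 * 60
    return total_minutes * (1.0 - slo_target)


def budget_remaining_fraction(slo_target: float, window_days: int,
                              downtime_minutes: float) -> float:
    """Return the fraction of the error budget still unspent (negative if overspent)."""
    budget = error_budget_minutes(slo_target, window_days)
    return (budget - downtime_minutes) / budget


# Hypothetical example: a 99.9% availability SLO over 30 days allows
# roughly 43.2 minutes of downtime; 10.8 minutes used leaves about 75%.
budget = error_budget_minutes(0.999, 30)
remaining = budget_remaining_fraction(0.999, 30, 10.8)
```

When the remaining fraction approaches zero, a team would typically slow feature releases in favor of reliability work; that trade-off is what "balancing reliability against feature velocity" means in practice.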
Proactive Resilience & Security
Chaos Engineering: Establish a formal Chaos Engineering program, utilizing tools like Chaos Monkey or Gremlin, to proactively test system resilience against controlled, real-world failures.
DevSecOps Integration: Embed security practices into the SRE lifecycle, ensuring configurations meet compliance standards and collaborating with Security teams on vulnerability management and access control policies.
Performance Engineering: Conduct large-scale load and stress testing, optimizing the performance of high-volume distributed systems to ensure readiness for peak usage and rapid growth.
Site Reliability Engineering Consultant:
We are seeking a highly experienced and pragmatic SRE Consultant to be the driving force behind our clients' operational transformation. This is a high-impact, client-facing role where you will not just execute, but strategize, design, and lead the adoption of world-class SRE principles, cloud-native architecture, and AIOps practices across diverse environments.
You will act as a trusted advisor, bridging the gap between engineering goals and business demands. If you are passionate about reducing toil, maximizing uptime, and mentoring teams to achieve hyper-scale reliability, you belong here.
Education & Experience: B.Tech, BE or MCA, 10 - 20 Yrs
Roles and Responsibilities: Advanced Technology & Strategic Impact
The SRE Consultant is expected to drive significant improvements by leveraging current and advanced technologies:
Strategic SRE & Observability Consulting
SLO Definition and Governance: Partner with business and engineering leaders to establish clear, measurable Service Level Objectives (SLOs), SLIs, and Error Budgets that directly align system reliability with customer experience.
Full-Stack Observability Design: Architect and implement advanced, centralized observability stacks using the OpenTelemetry standard and technologies like Prometheus, Grafana, and Jaeger to provide deep, actionable insights across distributed systems.
Performance and Resilience Audits: Conduct comprehensive reliability reviews and architecture assessments, providing actionable roadmaps for improving performance, scalability, and disaster recovery.
Cloud-Native & Automation Expertise
Kubernetes and Microservices Optimization: Consult on the optimal deployment, operation, and scaling of containerized workloads using Kubernetes (EKS/AKS/GKE), focusing on resource optimization and security best practices (e.g., service mesh with Istio/Linkerd).
Infrastructure as Code (IaC) & GitOps: Lead the transition to declarative infrastructure management using advanced patterns in Terraform or Pulumi, and implement continuous deployment pipelines based on GitOps principles (e.g., ArgoCD/Flux).
Toil Reduction and Automation: Identify and prioritize manual, repetitive tasks, designing and implementing Python/Go-based automation frameworks to dramatically reduce operational toil and free up engineering capacity.
Advanced Practices (AIOps & Chaos Engineering)
AIOps Implementation: Design strategies for incorporating Machine Learning into operations—including predictive alerting to anticipate failures, automated alert correlation to reduce noise, and AI-driven self-healing mechanisms.
Chaos Engineering Program Management: Establish and oversee Chaos Engineering initiatives using tools like Gremlin or Chaos Mesh to proactively validate system resilience under controlled failure-injection scenarios.
Mentorship and Enablement: Coach and mentor client SRE and DevOps teams on best practices, blameless post-mortems, and a shared culture of reliability and operational excellence.
DevOps / DevSecOps Architect:
Are you a master of the pipeline, ready to embed security and automation into every stage of the software lifecycle? We are looking for a DevOps/DevSecOps Architect to be the principal designer and evangelist for our unified development, operations, and security platform. This role is a catalyst for cultural and technical transformation.
You will design the blueprint for our next-generation CI/CD systems, championing immutable infrastructure, shift-left security, and GitOps practices. If you thrive on simplifying complexity and enabling rapid, secure delivery at hyper-scale, this is your opportunity to define the standard for operational excellence.
Education & Experience: B.Tech or BE / MCA / MSc, 10 - 20 Yrs
Roles and Responsibilities: Advanced Security & Automation
The DevOps / DevSecOps Architect will drive strategic initiatives utilizing the most current and advanced technologies:
DevSecOps Strategy and Architecture
Shift-Left Security: Integrate security validation tools (SAST, DAST, SCA) directly into the CI/CD pipeline, implementing automated vulnerability scanning for code, containers, and infrastructure as code (IaC).
Security Gate Design: Architect and enforce security quality gates, using policy engines like Open Policy Agent (OPA), to ensure compliance with regulatory standards and internal security baselines before deployment.
Secret Management: Design and govern enterprise-wide secret management solutions (e.g., HashiCorp Vault) to secure credentials, tokens, and keys used by applications and CI/CD tools.
Advanced CI/CD and GitOps
Next-Gen CI/CD Platform: Design, build, and optimize highly resilient, scalable, and fully automated CI/CD pipelines using modern tools such as Jenkins (Declarative Pipelines), GitLab CI, or GitHub Actions.
GitOps Implementation: Drive the adoption of GitOps principles for managing all infrastructure and application deployments using tools like ArgoCD or Flux, ensuring a single source of truth and full auditability.
Binary Artifact Management: Architect efficient artifact management and caching strategies using tools like Nexus or Artifactory to accelerate build times and enhance supply chain security.
Cloud-Native and Immutable Infrastructure
Kubernetes Orchestration: Serve as the subject matter expert for Kubernetes deployments, designing cluster architecture, networking (Cilium/Calico), and cost optimization strategies across multi-cloud environments (AWS, GCP, or Azure).
Infrastructure as Code (IaC): Standardize the use of Terraform or Pulumi to manage all cloud resources, promoting an immutable infrastructure model and defining best practices for state management.
Container Security: Implement rigorous image scanning (e.g., Clair, Trivy) and runtime security enforcement for containerized applications, utilizing solutions like Falco for behavioral monitoring.
Monitoring and Observability
Unified Observability: Define the strategy for consolidating metrics, logs, and traces using the OpenTelemetry standard and implementing an advanced monitoring stack (e.g., Prometheus, Loki, Grafana).
AIOps Integration: Explore and integrate early AIOps practices for advanced alert correlation and predictive failure analysis to reduce noise and preempt service disruptions.
DevOps / DevSecOps Consultant:
We are searching for a highly skilled and strategic DevOps / DevSecOps Consultant to guide our clients through their transformation journeys. This is a crucial, client-facing role where you will be the expert advisor, designing and implementing secure, automated, and hyper-efficient software delivery pipelines.
You won't just recommend solutions; you'll architect the future state, embed "shift-left" security into development processes, and enable engineering teams to master cloud-native, GitOps, and SRE principles. If you are driven by the challenge of solving complex cultural and technical problems to achieve rapid, secure deployment at scale, join us to lead the change.
Education & Experience: B.Tech or BE / MCA / MSc, 10 - 20 Yrs
Roles and Responsibilities: Advanced Consulting & Technical Leadership
The DevSecOps Consultant will drive strategic impact by mastering and implementing the following advanced technologies and methodologies:
DevSecOps Strategy and Enablement
Security Shift-Left Strategy: Design and implement comprehensive DevSecOps roadmaps that embed security activities—including SAST, DAST, and Supply Chain Security—early into the CI/CD lifecycle for clients.
Policy-as-Code Governance: Consult on and deploy robust policy enforcement using tools like Open Policy Agent (OPA) to validate cloud resource configurations, container security, and compliance requirements across various environments.
Secret Management Architecture: Advise clients on securing credentials and access by designing and integrating enterprise-grade secret management solutions, primarily HashiCorp Vault.
Cloud-Native & Advanced Automation
Kubernetes Transformation: Lead the architectural consulting for adopting and optimizing Kubernetes (EKS, GKE, AKS), advising on networking solutions (e.g., Service Mesh like Istio), cluster security, and cost management.
Infrastructure as Code (IaC) Mastery: Standardize and consult on large-scale Terraform or Pulumi implementation strategies for complex, multi-cloud infrastructure provisioning, promoting high reusability and maintainability.
GitOps Implementation: Design and lead the adoption of a GitOps workflow, leveraging tools like ArgoCD or Flux to ensure declarative, auditable, and automated application and infrastructure deployments.
CI/CD Pipeline and Observability
Pipeline Modernization: Architect highly available and performant CI/CD pipelines using modern platforms like GitHub Actions, GitLab CI, or Jenkins, with a focus on speed, reliability, and security gates.
Observability Consulting: Advise on the best practices for adopting OpenTelemetry and designing unified observability stacks (Prometheus, Grafana, Loki, Jaeger) to provide end-to-end visibility and improve incident response times.
Mentorship and Knowledge Transfer: Conduct workshops, hands-on sessions, and strategic meetings to upskill client engineering teams on SRE principles, blameless culture, and the practical application of DevSecOps tools.
Performance Engineering and Testing Architect:
Are you a visionary leader ready to transform system performance from an afterthought into a core architectural capability? We are seeking a Performance Engineering and Testing Architect to establish the strategy and technical standards for ensuring the speed, scalability, and resilience of our mission-critical applications.
This is more than just testing; it's about architectural consultation, system-level optimization, and embedding a culture of performance across the entire development lifecycle. You will leverage AI/ML techniques and cloud-native tooling to predict, prevent, and eliminate bottlenecks, making you the ultimate guardian of the customer experience and business success.
Education & Experience: B.Tech or BE / MCA / MSc, 10 - 20 Yrs
Roles and Responsibilities: Advanced Strategy & Technical Command
The Performance Engineering and Testing Architect will drive strategic initiatives utilizing the most current and advanced technologies:
Strategic Performance Architecture
Performance Engineering Roadmap: Define the enterprise-wide strategy for performance testing, monitoring, and optimization, shifting practices left to integrate performance validation into the earliest stages of design and development.
System Sizing and Capacity Planning: Consult on and design capacity models for high-volume, distributed, and cloud-native (Kubernetes) systems, ensuring cost-effective scalability and readiness for anticipated peak loads.
Performance Budgeting: Establish and enforce Performance Budgets and SLAs/SLOs across engineering teams, directly tying performance metrics to business outcomes and user experience goals.
Advanced Testing and Tooling
Modern Load Generation: Lead the implementation and standardization of advanced, scalable load testing frameworks such as Gatling, k6, or Locust, designed to simulate realistic user behavior and high concurrency in modern microservices architectures.
Chaos Engineering Integration: Collaborate with SRE teams to integrate Chaos Engineering principles into performance testing, simulating fault injection scenarios (e.g., latency, resource starvation) to measure resilience under stress.
Cloud-Native Performance Tools: Drive the adoption of cloud-specific performance monitoring and tuning capabilities across platforms like AWS, Azure, or GCP, specifically utilizing services for autoscaling and serverless performance optimization.
Deep Observability and AIOps
End-to-End Observability: Architect and govern the Performance Observability stack, ensuring comprehensive coverage of Distributed Tracing (e.g., OpenTelemetry, Jaeger), detailed metrics (e.g., Prometheus/Grafana), and log analysis (e.g., ELK/Loki).
AI/ML for Performance: Explore and implement AIOps techniques for performance analytics, including using machine learning to predict performance degradation, identify anomalous behavior in production, and auto-tune system parameters.
Bottleneck Analysis: Direct deep-dive performance analysis, utilizing code profilers, heap analyzers, and network monitoring tools to pinpoint root causes of latency and throughput limitations in highly complex systems (Java, Go, Node.js).
Performance Engineering and Testing Consultant:
We are seeking an expert Performance Engineering and Testing Consultant to serve as a high-impact, client-facing advisor. This role is for a proactive problem-solver who can diagnose, optimize, and future-proof the performance of mission-critical systems across diverse organizations.
You will move beyond traditional testing to embed performance as a core architectural discipline, leveraging cloud-native tooling, advanced observability, and AI/ML techniques to deliver systems that are not just fast, but resilient and highly scalable. If you thrive on technical complexity and driving measurable improvements in speed and customer experience, join us to lead performance transformation.
Education & Experience: B.Tech or BE / MCA / MSc, 10 - 20 Yrs
Roles and Responsibilities: Advanced Consulting & Optimization
The Performance Engineering and Testing Consultant will drive client success by mastering and implementing the following advanced technologies and practices:
Strategic Performance Consulting
Performance Strategy & Roadmapping: Lead client engagements to define and execute comprehensive performance testing and engineering strategies, shifting performance validation left into the development pipeline.
Capacity Planning & Sizing: Advise on efficient capacity models for highly distributed, microservices- and Kubernetes-based architectures, ensuring optimal resource utilization and cost control for multi-cloud deployments.
SLO Governance: Consult on establishing rigorous Service Level Objectives (SLOs) and Performance Budgets, ensuring engineering teams have clear, data-driven targets tied directly to user experience and business metrics.
Next-Gen Performance Testing & Tools
Scalable Load Tooling: Design and execute highly realistic and scalable load, stress, and endurance tests using modern, open-source frameworks like k6, Gatling, or Locust, integrating them seamlessly into client CI/CD pipelines.
Chaos Engineering Integration: Guide client teams on integrating Chaos Engineering principles into their performance lifecycle, intentionally introducing faults (e.g., high latency, resource limits) to validate system resilience under pressure.
Test Environment Optimization: Define best practices for creating and managing cost-effective, production-like test environments, often leveraging Infrastructure as Code (IaC) (Terraform/Pulumi) for rapid environment spin-up and tear-down.
Deep Observability and Diagnosis
Distributed Tracing & Monitoring: Architect and deploy advanced, unified observability solutions based on the OpenTelemetry standard, utilizing tools like Jaeger, Prometheus, and Grafana to provide deep, end-to-end performance visibility.
Bottleneck Analysis: Conduct expert-level analysis of code (e.g., using profilers), database queries, and network latency to pinpoint and recommend fixes for performance bottlenecks in complex application stacks (e.g., Java, Go, Python).
AIOps for Performance: Introduce concepts and initial implementations of AI/ML-based anomaly detection and predictive performance analytics to help clients transition from reactive monitoring to proactive incident prevention.
Director:
Are you a proven Executive Leader ready to define the future of operational excellence for a hyper-growth organization? We are seeking a Director to oversee and unite the strategic functions of Site Reliability Engineering (SRE), DevSecOps, and Performance Engineering.
This role demands a leader who can translate visionary strategy into measurable technical execution. You will be responsible for fostering a culture of blamelessness, continuous improvement, and security-first thinking while scaling our systems, our teams, and our practices across a global footprint. Your mandate is to drive world-class reliability, security, and velocity, ensuring our technical platform is a sustainable competitive advantage.
Education & Experience: B.Tech or BE / MCA / MSc, 10 - 22 Yrs
Key Accountabilities & Strategic Leadership
The Director will be responsible for setting the vision and governing standards across all critical engineering domains:
Executive Strategy & Organizational Transformation
Define the Unified Roadmap: Establish the 3-5 year technical strategy for Reliability, Security Automation, and Performance, ensuring alignment with overall business objectives, regulatory compliance, and cloud strategy (Multi-Cloud/Hybrid).
SLO-Driven Culture: Champion the adoption of Service Level Objectives (SLOs) as the primary mechanism for balancing feature velocity with stability, managing error budgets, and communicating service health to executive stakeholders.
Talent & Team Scaling: Recruit, mentor, and grow highly effective and diverse leadership teams (Architects, Managers, and Principal Engineers) across the SRE, DevSecOps, and Performance disciplines.
Financial Stewardship: Manage budgets for Cloud Infrastructure, tooling, and licenses, optimizing resource utilization and cost efficiency across Kubernetes and serverless platforms.
Advanced Architectural Governance
Security-First Architecture: Oversee the implementation of a comprehensive DevSecOps strategy, ensuring "shift-left" security is embedded in all CI/CD and GitOps pipelines, and governing the use of tools like HashiCorp Vault and Open Policy Agent (OPA).
Cloud-Native Resilience: Direct the architectural standards for Kubernetes and microservices, ensuring systems are designed for fault tolerance, auto-scaling, and operational consistency using Terraform/Pulumi (IaC).
AIOps & Observability Vision: Set the strategy for a unified, modern observability platform (leveraging OpenTelemetry, Prometheus, Grafana, Jaeger) and lead the exploration and implementation of AIOps for predictive alerting and automated remediation.
Operational Excellence and Risk Management
Performance as Architecture: Govern the performance engineering strategy, driving the adoption of k6/Gatling and ensuring proactive Performance Budgeting and Chaos Engineering are standard practices, targeting sub-second response times globally.
Incident Command: Own the highest level of the incident management process, ensuring effective communication, leading blameless post-mortems, and enforcing the subsequent architectural changes required to eliminate recurrence.
Intern - Technologies:
Are you ready to transcend textbook learning and dive into real-world technological challenges? Our Technologies Internship is designed for high-potential students who are looking not just for a placement, but for a launchpad. We are looking for individuals who are strong communicators, have exceptional presentation skills, and have a passion for innovation and creativity.
You won't be fetching coffee; you'll be actively contributing to projects involving advanced technologies that are shaping the industry. This program offers direct mentorship, hands-on experience with cutting-edge tools, and the chance to transform innovative ideas into deployable solutions.
Education & Experience: B.Tech or BE / MCA / MSc - Freshers
Key Focus Areas & Skills for Success
We seek candidates who demonstrate proficiency or a strong passion for the following core areas:
Modern Technical Foundation (Advanced Technologies)
In today's market, freshers are expected to have practical exposure to the tools and methodologies that drive modern software development. Your internship will focus on:
Cloud Fundamentals: Basic understanding or project experience with a major cloud provider (e.g., AWS, Azure, or GCP). This includes concepts like serverless computing (Lambda, Functions), containerization (Docker), and managed services.
DevOps/Automation Basics: Familiarity with version control (Git) and exposure to basic CI/CD concepts or scripting (e.g., Python, Shell) to automate tasks.
AI/ML & Data Science: Understanding of foundational machine learning concepts, simple model implementation (e.g., scikit-learn), or experience with large language models (LLMs) and generative AI concepts.
Essential Communication & Execution
Your ability to articulate ideas and drive projects is as critical as your code:
Strong Communication: You must be adept at clearly conveying complex technical information to both technical peers and non-technical stakeholders.
Presentation Skills: Confidence and ability to structure and deliver persuasive and informative presentations to team leads and project managers about your progress and findings.
Innovation & Creativity: A proactive mindset for identifying inefficiencies and proposing novel, unconventional technical solutions to enhance products or processes.
Current Market Expectations from Freshers
The industry is rapidly shifting toward more specialized and collaborative skill sets. We value interns who have:
Agile/Scrum Experience: Familiarity with Agile methodologies, sprint cycles, and collaborative tools (Jira, Trello).
Open Source Contribution: Experience contributing to or utilizing open-source projects demonstrates practical skill and community engagement.
Security Awareness: Basic understanding of security best practices (e.g., why secrets should not be hardcoded, how to prevent SQL injection).
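As a rough illustration of the two security habits named above, the sketch below reads a credential from the environment instead of hardcoding it and uses a parameterized query instead of string concatenation. The variable name APP_API_TOKEN, the users table, and the helper names are hypothetical and chosen only for this example.

```python
import os
import sqlite3


def get_api_token() -> str:
    """Read the token from the environment rather than embedding it in source code."""
    token = os.environ.get("APP_API_TOKEN")  # hypothetical variable name
    if token is None:
        raise RuntimeError("APP_API_TOKEN is not set")
    return token


def find_user(conn: sqlite3.Connection, username: str):
    """Look up a user with a parameterized query; the input is never spliced into SQL."""
    cur = conn.execute(
        "SELECT id, username FROM users WHERE username = ?", (username,)
    )
    return cur.fetchone()


# Small in-memory database so the example is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.execute("INSERT INTO users (username) VALUES (?)", ("alice",))
```

Because the username is passed as a bound parameter, an input such as `alice' OR '1'='1` is treated as a literal string and matches no rows.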
Intern - HR:
Are you passionate about people, process, and technology? Our HR Internship is a dynamic opportunity to gain hands-on experience in the critical functions that drive organizational success, focusing on talent acquisition, employee experience, and HR technology.
We seek an individual with a polite and professional demeanor and strong communication skills to act as a vital link between our team, new hires, and internal clients. You will be instrumental in perfecting our interview process, elevating resource onboarding, and maintaining seamless client communication. This is your chance to learn modern HR practices and contribute to a thriving workplace culture.
Education & Experience: B.Tech or BE / MCA / MSc - Freshers
Key Focus Areas & Modern HR Processes
Your internship will provide deep exposure to high-impact HR functions and advanced methodologies:
Talent Acquisition & Interview Process Excellence
Structured Interview Design: Assist in optimizing the candidate journey by analyzing current interview processes and helping to implement standardized, bias-reducing interview guides and rubrics.
Candidate Experience: Focus on maintaining high-touch, polite, and clear communication throughout the selection process, ensuring every candidate, regardless of outcome, has a positive impression.
Data-Driven Sourcing: Learn and utilize advanced sourcing tools (LinkedIn Recruiter, specialized databases) and analyze metrics (time-to-hire, source-of-hire) to improve recruiting efficiency.
Resource Onboarding & Employee Experience
Digitized Onboarding: Manage and streamline the resource onboarding process using modern HRIS/HCM platforms (e.g., Workday, SAP SuccessFactors), focusing on automated workflows and a paperless experience.
First 90-Day Experience: Contribute creative ideas to enhance the new hire experience, ensuring new employees are quickly integrated into the culture and provided with necessary resources for productivity.
Employee Engagement Tech: Participate in analyzing and deploying engagement tools (e.g., pulse survey platforms) to gather real-time feedback and recommend actionable improvements.
Client & Stakeholder Communication
Internal Client Communication: Handle professional, articulate communication with internal clients (team leads and managers) regarding hiring updates, policy changes, and employee relations inquiries.
HR Automation: Assist in researching and implementing AI/Chatbot solutions for common HR inquiries (e.g., benefits, PTO requests) to improve efficiency and service delivery.
Compliance Documentation: Learn best practices for maintaining digital HR records and ensuring compliance with labor laws, often leveraging secure, cloud-based document management systems.
© 2025 EUROMOX. All rights reserved.
Drop a mail: info@euromox.io