Livestream’s mission is to connect people and live events. Livestream offers event owners a complete set of hardware and software tools to share their events with a growing community online. More than 30 million viewers each month watch thousands of live events from customers including The New York Times, Facebook, ESPN, SpaceX and Warner Bros. Records. Founded in 2007, Livestream is headquartered in New York with offices in Los Angeles, Ukraine and India. Livestream.com is the destination for the top live content from around the world.
Livestream was featured in Forbes’ 2012 list of America’s Most Promising Companies (top 100) and ranked 12th on Inc. Magazine’s list of the fastest-growing private companies in the US.
Livestream also won the 2012 Streaming Media Reader’s Choice Award in multiple categories, including Best Online Video Technology Company, Best Field Encoder, Best Live Video Streaming Service, and Best Online Video Platform.
Responsibilities:
- Work in 24x7 rotational shifts: morning (6:00am-2:30pm), afternoon (2:00pm-10:30pm), and night (10:00pm-6:30am).
- Monitor, provision, configure, and troubleshoot our production servers and services.
- Maintain 100% uptime of the production services.
- Assist the SRE team in optimizing and tuning our servers and services for performance, scalability, and maintainability.
- Ensure that our monitoring tools catch and generate alerts on all production issues.
- Resolve issues reported by our monitoring tools, including following through on long-term issues.
- Follow escalation process through issue completion, including providing documentation after resolution.
- Supervise the junior system administrators to ensure that they are following procedures and completing tasks successfully.
- Assist in root cause analysis of production issues and provide a report that includes recommendations for identifying future issues more quickly and for preventing future failures entirely, whether through process or technology improvements.
- Send periodic NOC reports to managers with the system and service status.
- Manage backups and disaster recovery, including backup monitoring and verification, and leading restoration tests and disaster recovery drills.
- Serve as a technical escalation point during your shift.
Required skills and experience:
- 6+ years of experience supporting a real-time, 24x7, internet-based production web environment.
- Strong written and verbal communication skills.
- Ability to troubleshoot technical issues through application log and system log analysis.
- Ability to organize and prioritize tasks.
- Experience training and mentoring more junior members of the team, and working with other departments to solve cross-departmental problems.
- Experience with Unix scripting in shell, Perl, Python, Ruby, or equivalent languages (from an automation and monitoring standpoint).
- Ability to identify and configure add-on modules or plugins of open-source tools to effectively automate tasks and monitor production services.
- Strong knowledge of DNS, system build automation, and system configuration management tools.
- Familiarity with server virtualization technologies (Xen or equivalents).
- Experience with configuring and tuning monitoring systems (Nagios, Graphite or equivalents).
- Ability to work in fast-paced environments with weekly release schedules.
Desired skills and experience:
- Knowledge of Redis or other NoSQL databases.
- RHCE certification or equivalent experience.
- Experience with package management (preferably on Debian systems).
- Working knowledge of the following technologies (or equivalents): LDAP, NTP, DNS, FAI, DHCP, Subversion, Git, Chef, MySQL.
- Self-starter.