High Performance Computing (HPC) System Administrator III

Date: Feb 2, 2024

Location: Savannah, GA, US

Company: Gulfstream Aerospace Corporation

High Performance Computing (HPC) System Administrator III in GAC Savannah

Unique Skills:

**Hybrid work schedule available:**


Highly Desired Skills:

               HPC System Support: Scheduler Management, Code compilation, Large-scale multi-node application support

               Engineering Application Support: Applications consistent with development of mechanical and structural systems.


Additional Desired Skills:

               Configuration and Provisioning Management: Ansible, Satellite, Foreman

               Infrastructure Essentials: Apache, MySQL, DNS, DHCP, IPA, Monitoring, CIFS/Samba, NFS, iSCSI, FC

               Storage Experience: NAS and SAN Storage systems, Lustre, GPFS, VAST

               Virtualization: VMware, RHEV

               Networking: Basic layer-2 network operations

               Data Center Operations: Physical system management



Education and Experience Requirements

Bachelor's degree in Engineering, Computer Science, Information Technology, or a related curriculum, or an equivalent combination of education and experience sufficient to successfully perform the essential functions of the job. A Master's degree may be used to offset one year of the experience requirement; a PhD may offset two years. 7 years of experience required in an HPC or scientific computing environment, including the installation, configuration, and maintenance of Red Hat Package Manager (RPM)-based Linux distributions (Red Hat, SUSE).

Position Purpose:

Contribute to the creation of the strategic objectives of the High Performance Computing environment in the Advanced Computing Technologies department, and develop operational plans, goals, and strategies to best serve the Computational Fluid Dynamics, Simulation, and Modeling Engineering business units. Provide technical oversight and operational support to other members of the team to ensure sustained functionality of the HPC environment in accordance with the operational goals and plans. Responsible for the optimum integration of scientific applications with HPC technology and the exploration of new HPC technology to better meet the needs of the business.

Job Description

Principal Duties and Responsibilities:

Essential Functions:
  1. Assume responsibility for the day-to-day operations of Gulfstream's production HPC cluster.
  2. Assist end users running applications on the HPC cluster.
  3. Provide third-level support for end users who experience problems on engineering workstations and remote visualization systems.
  4. Manage, maintain, monitor, and control interactive and batch processes, both scheduled and unscheduled (including on-request processing).
  5. Ensure engineering-defined batch processing and backups are completed in the correct sequence and within the established time periods.
  6. Suggest improvements to processing capabilities and efficiencies through system tuning and other hardware and software optimizations and improvements.
  7. Perform regular monitoring of utilization needs and efficiencies, and report on tuning initiatives.
  8. Perform proactive failure trend analysis and root cause analysis for all system failures.
  9. Produce trend reports to highlight production issues and follow predetermined action and escalation procedures when issues are encountered.
  10. Monitor, verify, and make appropriate adjustments to support proper application executions.
  11. Provide technical solutions that meet performance and processing objectives of the business areas.
  12. Perform upgrades that comply with corporate policies and industry best practices.
  13. Provide leadership to HPC Administrators during system upgrades and outages.
  14. Create thorough upgrade plans that comply with corporate policies and industry best practices.
  15. Assist in the introduction of new technologies that can provide greater capabilities, improved productivity, and reduced total cost of ownership.
  16. Participate in the design of HPC technical solutions.
  17. Continuously evaluate the efficiency, effectiveness, and interoperability of existing technology, and suggest areas for improvement.

Additional Functions:
  1. Maintain technical relationships with multiple hardware and software vendors.
  2. Work multiple operational windows as required.
  3. Provide on-call support 24x7.
  4. Assist in development and implementation of technical, hardware, and software standards.

Perform other duties as assigned.

Other Requirements:
  1. Experience with management of InfiniBand-based Linux HPC clusters, high performance parallel storage, and configuration and management of cluster scheduling software.
  2. Experience managing High Performance Computing low-latency, high-bandwidth interconnects.
  3. Experience supporting Linux-based scientific workstations running visualization applications.

Additional Information

Requisition Number: 217320

Category: Information Systems

Percentage of Travel: Up to 25%

Shift: First

Employment Type: Full-time

Posting End Date: 02/29/2024


Equal Opportunity Employer/Veterans/Disabled.



Gulfstream does not provide work visa sponsorship for this position, unless the applicant is a currently sponsored Gulfstream employee.


Copyright © 2023 Gulfstream Aerospace Corporation. All Rights Reserved. A General Dynamics Company.

Gulfstream Aerospace Corporation, a wholly owned subsidiary of General Dynamics (NYSE: GD), designs, develops, manufactures, markets, services, and supports the world's most technologically advanced business jet aircraft.

Nearest Major Market: Savannah

Job Segment: Linux, Engineer, Computer Science, Aerospace, System Administrator, Technology, Engineering, Aviation