BrooklineRecruiter Since 2001
the smart solution for Brookline jobs

HPC Systems Admin

Company: TPA technologies
Location: Brookline
Posted on: June 17, 2022

Job Description:

HPC Systems Administrator

Skills

HPC:
- Manage multi-vendor filesystems such as XFS and GPFS, including upgrades, patching, space management, GPFS cluster management, and diagnostics
- Manage the Torque and Slurm workload schedulers, including upgrades, patches, diagnostics, user resource consumption, and configuration
- Manage Bright cluster management software, including creating and maintaining images, managing job schedules, and upgrades/patching
- Install and configure hardware, operating systems, and commercial software packages
- Manage user accounts, tune system performance, install system-wide software, and allocate mass storage space

Systems Administration:
- Advanced RHEL systems administration, including hardware setup, upgrades, and patching
- Remediation of vulnerabilities
- Performance tuning and server hardening
- Disk space management
- Diagnostics (slowness, nodes down, etc.)

Software experience:
- Torque
- Slurm
- Bright
- Moab
- MATLAB
- Experience working with containers (Docker, Singularity, Podman, Kubernetes) a plus
- Experience working with Git and supporting CI/CD pipelines a plus

Job Responsibilities:
- Install, configure, fine-tune, and troubleshoot multi-vendor Linux HPC servers
- Build and deploy open source software and software from vendors/partners
- Diagnose and resolve system operational problems quickly and effectively
- Verify full operation of systems, including network, systems, and storage performance
- Configure the scheduling and queuing system
- Troubleshoot and maintain InfiniBand and Ethernet networks
- Understand, maintain, and support the high-performance parallel storage system
- Assist users and the research team running applications on the HPC cluster
- Manage, maintain, monitor, and control interactive and batch processes (scheduled and unscheduled)

Requirements:
- Expert knowledge of HPC server hardware, including HP and Dell
- Expert knowledge of CentOS and Red Hat
- Expert knowledge of related parallel distributed file systems such as IBM GPFS
- Advanced knowledge of cluster storage systems, including Isilon
- Advanced knowledge of the Linux operating system, such as kernel compiles, boot-time command-line options, SELinux, rpm, and yum
- Advanced proficiency with NIS, NFS, autofs, TCP/IP, Linux network configuration, local storage, lm_sensors, and IPMI (required)
- Intermediate knowledge of HPC resource managers such as PBS, Torque, and Moab
- Intermediate knowledge of Bash, Perl, PHP, awk, sed, grep, and HTML
- Intermediate skill with scripting tools and leveraging solutions
- Ability to provide day-to-day 24 x 7 support and participate in an on-call rotation
- Must be able to lift and move 40 lbs
- Hybrid preferred, but open to remote

Keywords: TPA technologies, Brookline, HPC Systems Admin, Other, Brookline, Massachusetts
