We seek candidates for positions working on our AI compiler in the areas of code generation, network-level optimization, and operator implementation. Applicants need not be proficient in more than one area and are not expected to take on all of the responsibilities below.
Responsibilities
1. Make generic implementations of ML operators suitable for a library
2. Implement the generation of assembly code for ML operators, optimized for a given set of size constraints and data alignments
3. Design and implement communication and synchronization on a massively parallel chip
4. Determine the best parallelization approach for an operator given a problem size and the characteristics of the target architecture
5. Implement code that detects, at the IR level, combinations of operators that can be optimized, and implement the corresponding IR transformations
6. Design and implement strategies to manage the movement of data through the cache hierarchy
7. Design and implement operator scheduling algorithms
8. Analyze the performance of ML networks at operator and cycle levels
9. Work actively on architectural requirements for next-generation ML chips
10. Work jointly with other groups, including data science, architecture, verification, and RTL teams, on testing and optimizing the performance of ML networks
Minimum Qualifications
1. 2+ years of experience in at least one of the following: ML compilers, general-purpose compilers, high-performance library implementations, high-performance parallel runtimes, or code parallelization
2. Knowledge of software design and engineering, including object-oriented programming and design patterns
3. Knowledge of testing methodologies and continuous integration
4. Ability to write assembly code, preferably for RISC-V and the RISC-V Vector Extension
5. Experience writing object-oriented C++ code
6. Knowledge of development tools, including extending CMake files, using Git, and working with CI/CD workflows
7. BS in Computer Science or a related technical field
8. Business-fluent English
Desired Qualifications
1. Strong knowledge of computer architecture and parallelization
2. Experience in the implementation of ML operators
3. Experience in compiler optimizations, IR manipulations, and lowering into instructions
4. Knowledge of vectorization