Teradar develops sensors that produce high-resolution, all-weather 4D imaging to improve object detection and situational awareness. The company builds solid-state terahertz imaging sensors using…
At Teradar, we are pioneering a new era in perception with the world’s first automotive terahertz vision sensor, delivering ultra-high-resolution imaging in any weather condition. Founded in Boston, Teradar’s solid-state, chip-scale technology unlocks safer, smarter vehicles and opens the door to transformative applications in mobility, defense, and beyond.
We are looking to hire a DSP Software Developer with experience in multicore optimization and a strong understanding of memory hierarchy considerations in heterogeneous SoCs.
The ideal candidate can partition real-time signal processing pipelines across heterogeneous DSP cores, squeeze every cycle out of shared-memory hierarchies, and orchestrate data movement over a Network-on-Chip (NoC).
You’ll work close to the metal on both fronts: scaling workloads across many cores and optimizing the hot inner loops on each one.
Location: Boston, MA
Responsibilities
Architect and implement multicore software for radar signal processing on an SoC, partitioning pipelines across multiple cores connected by a network-on-chip.
Design data and task decomposition strategies that balance compute load, minimize inter-core communication, and exploit pipeline, data, and functional parallelism across radar processing stages.
Manage a multi-level memory hierarchy (core-local, cluster-shared, and SoC-global): placing buffers, sizing working sets, and orchestrating DMA transfers to sustain high memory throughput and keep cores fed with radar data cubes while hiding stalls behind useful work.
Develop and optimize per-core radar kernels (FFTs, filters, matrix operations, CFAR variants, MIMO processing) using SIMD, VLIW, fractional arithmetic, and intrinsics.
Build, use, and maintain pre-silicon validation platforms such as virtual prototypes for early multicore software development, performance projection, and testing.
Profile end-to-end radar pipelines across cores (identifying load imbalance, NoC contention, memory bandwidth bottlenecks, and synchronization overhead) and iterate on partitioning, scheduling, and data layout to optimize performance, power, and area trade-offs.
Skills & Experience
Strong experience developing multicore embedded software on an SoC, including workload partitioning, scheduling, and load balancing across cores.
Hands-on experience managing shared and distributed memory across a multi-level memory hierarchy, including explicit DMA-driven data movement, double/multi-buffering, and techniques for sustaining high memory throughput under real-time constraints.
Working knowledge of bare-metal programming and/or real-time operating systems, including boot flow, linker scripts, memory maps, interrupt and exception handling, and real-time task scheduling.
Solid understanding of computer architecture and micro-architecture fundamentals.
Proficiency in C/C++ along with SIMD and VLIW programming models, intrinsics, and fractional arithmetic applied to radar or DSP kernels.
Familiarity with radar signal processing concepts (FMCW radar, range/Doppler/angle estimation, FFTs, CFAR detection, beamforming, MIMO, and tracking) and the data-flow and bandwidth characteristics they impose on the processing pipeline.
Exposure to virtual prototypes or pre-silicon validation platforms.
Ability to analyze and resolve performance bottlenecks spanning compute, memory bandwidth, NoC, and synchronization, and to optimize for power, performance, and area (PPA) across the full multicore radar pipeline.