New OCI (Oracle Cloud Infrastructure) Compute instances accelerated by NVIDIA L40S now available to order
By Rohil Bhargava and Dave Salvator
Enterprises are rapidly adopting generative AI, large language models (LLMs), advanced graphics and digital twins to increase operational efficiencies, reduce costs and drive innovation.
However, to adopt these technologies effectively, enterprises need access to state-of-the-art, full-stack accelerated computing platforms.
To meet this demand, Oracle Cloud Infrastructure (OCI) today announced NVIDIA L40S GPU bare-metal instances available to order and the upcoming availability of a new virtual machine accelerated by a single NVIDIA H100 Tensor Core GPU. This new VM expands OCI’s existing H100 portfolio, which includes an NVIDIA HGX H100 8-GPU bare-metal instance.
Paired with NVIDIA networking and running the NVIDIA software stack, these platforms deliver unmatched performance and efficiency, enabling enterprises to advance generative AI.
NVIDIA L40S Now Available to Order on OCI
The NVIDIA L40S is a universal data center GPU designed to deliver breakthrough multi-workload acceleration for generative AI, graphics and video applications. Equipped with fourth-generation Tensor Cores and support for the FP8 data format, the L40S excels in training and fine-tuning small- to midsize LLMs and in inference across a wide range of generative AI use cases.
For example, a single L40S (FP8) can generate up to 1.4x more tokens per second than a single NVIDIA A100 Tensor Core GPU (FP16) for Llama-3-8B with NVIDIA TensorRT-LLM at an input and output sequence length of 128.
The L40S also has best-in-class graphics and media acceleration. Its third-generation NVIDIA Ray Tracing Cores (RT Cores) and multiple encode/decode engines make it ideal for advanced visualization and digital twin applications.
The L40S delivers up to 3.8x the real-time ray-tracing performance of its predecessor and supports NVIDIA DLSS 3 for faster rendering and smoother frame rates. This makes the GPU ideal for developing applications on the NVIDIA Omniverse platform, enabling real-time, photorealistic 3D simulations and AI-enabled digital twins.
With Omniverse on L40S, enterprises can develop advanced 3D applications and workflows for industrial digitalization that will allow them to design, simulate and optimize products, processes, and facilities in real time before going into production.
OCI will offer the L40S GPU in its BM.GPU.L40S.4 bare-metal compute shape, featuring four NVIDIA L40S GPUs, each with 48GB of GDDR6 memory. This shape includes local NVMe drives with 15.36TB capacity, 4th Generation Intel Xeon CPUs with 112 cores, and 1TB of system memory.
These shapes eliminate virtualization overhead entirely, delivering strong bare-metal performance for high-throughput, latency-sensitive AI and machine learning workloads. The accelerated compute shape features the NVIDIA BlueField-3 DPU for improved server efficiency, offloading data-center tasks from CPUs to accelerate networking, storage and security workloads. The use of BlueField furthers OCI’s strategy of off-box virtualization across its entire fleet.
OCI Supercluster with NVIDIA L40S enables ultra-high performance with 800Gbps of internode bandwidth and low latency for up to 3,840 GPUs. OCI’s cluster network uses NVIDIA ConnectX-7 NICs over RoCE v2 to support high-throughput and latency-sensitive workloads.
“We chose OCI AI infrastructure with bare-metal instances and NVIDIA L40S GPUs for 30% more efficient video encoding. Videos processed with Beamr Cloud on OCI will consume up to 50% less storage and network bandwidth, speeding up file transfers by 2x and increasing productivity for end users. Beamr will provide OCI customers with video AI workflows, preparing them for the future of video.” — Sharon Carmel, CEO, Beamr Cloud
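As a sketch, ordering one of these bare-metal shapes can be done through the OCI CLI’s `oci compute instance launch` command; the availability domain and the compartment, image and subnet OCIDs below are placeholders to replace with your own values, and shape availability varies by region.

```
# Launch a 4x NVIDIA L40S bare-metal instance (OCIDs are placeholders)
oci compute instance launch \
  --availability-domain "Uocm:PHX-AD-1" \
  --compartment-id "ocid1.compartment.oc1..<placeholder>" \
  --shape "BM.GPU.L40S.4" \
  --image-id "ocid1.image.oc1..<placeholder>" \
  --subnet-id "ocid1.subnet.oc1..<placeholder>" \
  --display-name "l40s-ai-node"
```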
Single-GPU H100 VMs Coming Soon on OCI
The VM.GPU.H100.1 compute virtual machine shape, accelerated by a single NVIDIA H100 Tensor Core GPU, is coming soon to OCI. This will provide cost-effective on-demand access for enterprises looking to use the power of NVIDIA H100 GPUs for their generative AI and HPC workloads.
A single H100 provides a good platform for smaller workloads and LLM inference. For example, one H100 GPU can generate more than 27,000 tokens per second for Llama 3 8B (up to 4x more throughput than a single A100 at FP16 precision) with NVIDIA TensorRT-LLM at an input and output sequence length of 128 and FP8 precision.
The VM.GPU.H100.1 shape includes 3.4TB of NVMe drive capacity, 16 cores of 4th Gen Intel Xeon processors and 14GB of system memory, making it well-suited for a range of AI tasks.
“Oracle Cloud’s bare metal compute with NVIDIA H100 and A100 GPUs, low-latency Supercluster, and high-performance storage delivers up to 20% better price-performance for Altair’s computational fluid dynamics (CFD) and structural mechanics solvers. We look forward to leveraging these GPUs with virtual machines for the Altair Unlimited virtual appliance.” — Yeshwant Mummaneni, Chief Engineer, Data Management and Analytics, Altair
GH200 Bare-Metal Instances Available for Validation
OCI has also made available the BM.GPU.GH200 compute shape for customer testing. It features the NVIDIA Grace Hopper Superchip and NVLink-C2C, a high-bandwidth 900GB/s connection between the NVIDIA Grace CPU and Hopper GPU. This provides unified cache coherence and over 600GB of accessible memory, enabling up to 10x higher performance than the NVIDIA A100 for applications processing terabytes of data.
Optimized Software for Enterprise AI
Enterprises have a wide variety of NVIDIA GPUs to accelerate their AI, HPC and data analytics workloads on OCI. However, maximizing the full potential of these GPU-accelerated compute instances requires an optimized software layer.
NVIDIA NIM, part of the NVIDIA AI Enterprise software platform available on the OCI Marketplace, is a set of easy-to-use microservices designed for the secure, reliable deployment of high-performance AI model inference, helping enterprises build world-class generative AI applications.
Optimized for NVIDIA GPUs, NIM’s prebuilt containers offer developers a lower total cost of ownership, faster time to market and enterprise-grade security. NIM microservices for popular community models, found in the NVIDIA API Catalog, can be deployed on OCI.
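NIM microservices expose an OpenAI-compatible HTTP API, so querying a deployed model is a standard chat-completions request. The sketch below, using only the Python standard library, assumes a NIM container already running at `http://localhost:8000` and serving a model named `meta/llama3-8b-instruct` — both are placeholder values to replace with your own deployment's endpoint and model name.

```python
import json
import urllib.request

# Placeholder endpoint and model name -- substitute the values for your
# own deployed NIM microservice.
NIM_URL = "http://localhost:8000/v1/chat/completions"
MODEL = "meta/llama3-8b-instruct"


def build_chat_request(model: str, prompt: str, max_tokens: int = 128) -> dict:
    """Build an OpenAI-style chat-completion payload for a NIM endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


def query_nim(prompt: str) -> str:
    """POST a chat request to the NIM endpoint and return the reply text."""
    payload = json.dumps(build_chat_request(MODEL, prompt)).encode()
    req = urllib.request.Request(
        NIM_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__":
    print(query_nim("Summarize what an NVIDIA L40S GPU is good for."))
```

Because the API follows the OpenAI schema, the same request shape works unchanged whether the model runs on an L40S VM, an H100 instance or a Supercluster behind a load balancer.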
Performance will continue to improve over time with the monthly cadence of NIM releases and with upcoming GPU-accelerated instances, including NVIDIA H200 Tensor Core GPUs and NVIDIA Blackwell GPUs.