Google sharpens AI toolset with new chips, GPUs, more at Cloud Next


Cloud Next Google is razor-focused on AI at this year's Cloud Next, with a slew of hardware projects, including TPU updates, GPU options, and a bevy of software tools to make it all work.

At the first in-person version of the event since before the pandemic, held in the massive Moscone Center in San Francisco, Google let loose details on its Cloud TPU v5e, the latest of its Tensor Processing Unit AI accelerators, plus virtual machine instances powered by Nvidia H100 GPUs.

TPUs are Google's custom silicon for accelerating machine learning, and the Cloud TPU service is based around the company's own TensorFlow machine learning framework in addition to other frameworks, including JAX and PyTorch.
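
For a flavor of what targeting those accelerators looks like, here is a minimal JAX sketch that lists whatever devices a Cloud TPU VM exposes and runs a compiled computation on them; it uses only stock JAX calls and nothing specific to any TPU generation.

```python
# Minimal JAX sketch: enumerate the attached accelerators (each TPU chip
# appears as a device) and run an XLA-compiled computation on them.
import jax
import jax.numpy as jnp

print(jax.devices())  # on a TPU VM: [TpuDevice(id=0), TpuDevice(id=1), ...]

@jax.jit  # compile for whatever backend is present (TPU, GPU, or CPU)
def affine(w, x, b):
    return jnp.dot(w, x) + b

key = jax.random.PRNGKey(0)
w = jax.random.normal(key, (1024, 1024))
x = jax.random.normal(key, (1024, 1024))
b = jnp.ones((1024,))

print(affine(w, x, b).shape)  # (1024, 1024)
```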

Its previous AI chip, TPU v4, was officially released in 2021, though the search giant had been testing it for several years prior.

With Cloud TPU v5e, Google claims to have doubled the training performance per dollar and delivered 2.5 times the inference performance per dollar for large language models (LLMs) and generative AI, compared with Cloud TPU v4.

The cloud giant uses TPU v4 engines to do inference for its own search engine and ad serving platforms.

Google will be offering eight different virtual machine configurations, ranging from one TPU chip to over 250 within a single slice.

It's not all about hardware, of course. Google is also aiming for greater scalability in handling large AI workloads in Cloud TPU v5e with a feature called Multislice. Currently in preview, this has been developed to allow users to scale models beyond the confines of a single TPU pod to encompass tens of thousands of TPU chips, if necessary. Training jobs were previously limited to a single slice of TPU chips.
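
Multislice itself is Google-managed plumbing, but the programming model it feeds can be seen in JAX's standard sharding APIs: describe a mesh over however many chips are in reach and let the compiler partition the work. The sketch below is a generic, single-process illustration of that idea, not Google's Multislice code.

```python
# Sketch of spreading a computation across every visible TPU chip with JAX's
# sharding APIs. Multislice extends this picture beyond one pod; the program
# stays the same, only the device count grows.
import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

devices = np.array(jax.devices())           # all chips this process can see
mesh = Mesh(devices, axis_names=("data",))  # 1-D mesh: pure data parallelism

# Shard a batch along its leading axis, one shard per chip.
batch = jnp.ones((len(devices) * 128, 512))
batch = jax.device_put(batch, NamedSharding(mesh, P("data", None)))

@jax.jit
def step(x):
    # The compiler runs this on every shard; no per-device code is needed.
    return jnp.tanh(x) @ jnp.ones((512, 512))

out = step(batch)
print(out.sharding)  # confirms the result stays sharded across the mesh
```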

Also aimed at demanding AI workloads such as LLMs are Google's A3 virtual machine instances, which pair eight Nvidia H100 GPUs with dual 4th Gen Intel Xeon Scalable processors and 2TB of memory. These instances were first announced at Google I/O back in May, but are now set to be available next month, it said.

With improvements in networking bandwidth due to an offload network adapter and the Nvidia Collective Communications Library (NCCL), Google expects the A3 virtual machines will provide a boost for users looking to build ever more sophisticated AI models.
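
NCCL is the collective-communications layer most frameworks lean on for multi-GPU work on hosts like these. As a generic illustration rather than anything A3-specific, a PyTorch job typically launches one process per GPU and initializes its process group on the NCCL backend:

```python
# Generic multi-GPU setup over NCCL, one process per GPU, as commonly run on
# an 8-GPU host. Launch with: torchrun --nproc_per_node=8 this_script.py
import os
import torch
import torch.distributed as dist

def main():
    dist.init_process_group(backend="nccl")    # NCCL moves GPU-to-GPU traffic
    local_rank = int(os.environ.get("LOCAL_RANK", 0))
    torch.cuda.set_device(local_rank)

    # Each process contributes a tensor; all_reduce sums them across GPUs,
    # exercising the interconnect paths NCCL manages.
    t = torch.ones(1024, device="cuda") * dist.get_rank()
    dist.all_reduce(t, op=dist.ReduceOp.SUM)

    if dist.get_rank() == 0:
        print("all_reduce result per element:", t[0].item())
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```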

Google Next also yielded details about GKE Enterprise, described as a premium edition of the company's managed Google Kubernetes Engine (GKE) service for containerized workloads.

GKE Enterprise edition, to be available in preview from early September, sports a new multi-cluster capability that lets customers group similar workloads together as "fleets" and apply custom configurations and policy guardrails across the fleet, Google said.

This edition comes with managed security features including workload vulnerability insights, governance and policy controls, plus a managed service mesh. With capabilities drawn from Google's Anthos platform, the company claims that the GKE Enterprise edition can span hybrid and multi-cloud scenarios to let users run container workloads on other public clouds and on-premises as well as on GKE.

In addition, GKE itself now supports both Cloud TPU v5e and the A3 virtual machine instances with H100 GPUs for demanding AI workloads, Google said.

Also continuing the AI theme, Google is bringing additions to its Google Distributed Cloud (GDC) offering, plus updated hardware to support the on-prem extension to its cloud platform.

The three new AI and data offerings are Vertex AI integrations, AlloyDB Omni, and Dataproc Spark. The Vertex integrations bring Vertex Prediction and Vertex Pipelines to GDC Hosted, though these will only be available in preview from Q2 2024.

AlloyDB Omni is a new managed database engine, claimed to offer twice the speed of PostgreSQL for transactional workloads, and is currently available in preview.
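
AlloyDB is Google's PostgreSQL-compatible engine, so existing Postgres tooling should talk to Omni unchanged. A minimal sketch with the stock psycopg2 driver, where the host and credentials are placeholders:

```python
# Connecting to a PostgreSQL-compatible engine such as AlloyDB Omni with the
# standard psycopg2 driver; host and credentials below are placeholders.
import psycopg2

conn = psycopg2.connect(
    host="10.0.0.5",        # wherever the Omni instance is running
    port=5432,
    dbname="appdb",
    user="postgres",
    password="example-password",
)
with conn, conn.cursor() as cur:
    cur.execute("SELECT version();")  # reports a PostgreSQL version string
    print(cur.fetchone()[0])
conn.close()
```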

Dataproc Spark is a managed service for running analytics workloads under Apache Spark, claimed to offer users lower costs than deploying Spark themselves. It will be available in preview from Q4.
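
Whatever the hosting arrangement, the workload itself is ordinary Spark code. A minimal PySpark job of the sort such a service would run, with an illustrative (hypothetical) input path:

```python
# A plain PySpark analytics job; a managed Spark service runs code like this
# without the user operating the cluster. The input bucket is hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("word-counts").getOrCreate()

df = spark.read.text("gs://example-bucket/logs/*.txt")
counts = (
    df.select(F.explode(F.split(F.col("value"), r"\s+")).alias("word"))
      .groupBy("word")
      .count()
      .orderBy(F.col("count").desc())
)
counts.show(10)
spark.stop()
```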

Finally, Google said it is introducing an updated hardware stack for GDC, featuring 4th Gen Intel Xeon Scalable processors and higher-performance network fabrics with up to 400Gbps throughput.

And Ampere's AmpereOne Arm-compatible processors will at some point be available as a private preview in the form of a C3A compute instance. ®