Microsoft Researchers Develop a 2-Billion-Parameter 1-bit AI LLM: A Model Small Enough to Run on Some CPUs


| Benchmark | BitNet b1.58 2B | LLaMa 3.2 1B | Gemma 3 1B | Qwen 2.5 1.5B |
|---|---|---|---|---|
| Non-embedding memory usage | 0.4 GB | 2 GB | 1.4 GB | 2.6 GB |
| Latency (CPU decoding) | 29 ms | 48 ms | 41 ms | 65 ms |
| Training tokens | 4 trillion | 9 trillion | 2 trillion | 18 trillion |
| ARC-Challenge | 49.91 | 37.80 | 38.40 | 46.67 |
| ARC-Easy | 74.79 | 63.17 | 63.13 | 76.01 |
| OpenbookQA | 41.60 | 34.80 | 38.80 | 40.80 |
| BoolQ | 80.18 | 64.65 | 74.22 | 78.04 |
| HellaSwag | 68.44 | 60.80 | 57.69 | 68.28 |
| PIQA | 77.09 | 74.21 | 71.93 | 76.12 |
| WinoGrande | 71.90 | 59.51 | 58.48 | 62.83 |
| CommonsenseQA | 71.58 | 58.48 | 42.10 | 76.41 |
| TruthfulQA | 45.31 | 43.80 | 38.66 | 46.67 |
| TriviaQA | 33.57 | 37.60 | 23.49 | 38.37 |
| MMLU | 53.17 | 45.58 | 39.91 | 60.25 |
| HumanEval+ | 38.40 | 31.10 | 37.20 | 50.60 |
| GSM8K | 58.38 | 38.21 | 31.16 | 56.79 |
| MATH-500 | 43.40 | 23.00 | 42.00 | 53.00 |
| IFEval | 53.48 | 62.71 | 66.67 | 50.12 |
| MT-bench | 5.85 | 5.43 | 6.40 | 6.12 |
| Average | 54.19 | 44.90 | 43.74 | 55.23 |
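As a rough sanity check on the memory figure reported for BitNet b1.58 2B, the arithmetic can be sketched as follows. This is a back-of-the-envelope estimate, not the researchers' own calculation: it assumes all 2 billion parameters are stored as three-valued weights (which overstates the non-embedding count somewhat), ignores packing and activation overhead, and uses a hypothetical helper name, `weight_memory_gb`.

```python
import math

def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 10^9 bytes). Hypothetical helper."""
    return num_params * bits_per_weight / 8 / 1e9

# A weight limited to three values needs log2(3) bits, roughly 1.58.
ternary_bits = math.log2(3)

# Assumption: all 2 billion parameters are counted as quantized weights.
bitnet_gb = weight_memory_gb(2e9, ternary_bits)
fp16_gb = weight_memory_gb(2e9, 16)  # same parameter count at 16 bits, for comparison

print(f"ternary: {bitnet_gb:.2f} GB, fp16: {fp16_gb:.2f} GB")
# prints "ternary: 0.40 GB, fp16: 4.00 GB"
```

Under these assumptions the estimate comes out close to the 0.4 GB in the table, about a tenth of what the same parameter count would need at 16 bits per weight.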

However, the model must be run with the bitnet.cpp inference framework to achieve this efficiency. The team specifically said that the model will not show the performance efficiency gains “when using it with the standard transformers library, even with the required fork.”

To take advantage of these benefits on lightweight hardware, you will need the framework, which is available on GitHub. The repository describes bitnet.cpp as offering “a suite of optimized kernels that support fast and lossless inference of 1.58-bit models on CPU (with NPU and GPU support coming next).” While it doesn’t support AI-specific hardware at the moment, it still allows anyone with a computer to experiment with AI without requiring expensive components.
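For readers wondering what a "1.58-bit" weight looks like, here is a minimal sketch of reducing ordinary floating-point weights to the three values -1, 0, and +1. It assumes an absmean-style scheme (divide by the mean absolute weight, round, then clip), which is our illustrative choice; the function name `absmean_ternary` and the sample weights are hypothetical, and this is not the kernel code bitnet.cpp actually runs.

```python
def absmean_ternary(weights, eps=1e-8):
    """Map floats to {-1, 0, +1} using an absmean scale (illustrative assumption)."""
    # Scale by the mean absolute value; eps avoids division by zero.
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    # Round to the nearest integer, then clip into the range [-1, 1].
    quantized = [max(-1, min(1, round(w / scale))) for w in weights]
    return quantized, scale

# Made-up sample weights, for illustration only.
weights = [0.9, -0.05, 0.4, -1.3, 0.02]
q, scale = absmean_ternary(weights)

print(q)  # prints [1, 0, 1, -1, 0]
```

Each weight then needs only one of three symbols plus a single shared scale factor, which is where the storage savings come from.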

AI models are often criticized for consuming too much energy to train and operate. But lightweight LLMs such as BitNet b1.58 2B4T could help us run AI models locally on less powerful hardware. This could reduce our dependence on massive data centers and even give people without access to the latest processors with built-in NPUs or the most powerful GPUs a way to use artificial intelligence.
