Learning Language Models from Scratch: Code and Parameters
So I took Sebastian's large language model code as is (note: the copyright belongs to him). But I was curious: what exactly are parameters? See my previous post, but...

```python
# This is a large model, reducing
# LLAMA2_CONFIG_7B = {
#     "vocab_size": 32000,      # Vocabulary size
#     "context_length": 4096,   # Context length
#     "emb_dim": 4096,          # Embedding dimension
#     "n_heads": 32,            # Number of attention heads
#     "n_layers": 32,           # Number of layers
#     "hidden_dim": 11008,      # NEW: Size of the intermediate dimension in FeedForward
#     "dtype": torch.bfloat16   # NEW: Lower-precision dtype to save memory
# }

# Previous run
# huggingface_hub version: 0.26.1
# sentencepiece version: 0.2.0
# torch version: 2.4.1
# Total number of parameters: 6,738,415,616
# float32 (PyTorch default): 52.33 GB
# bfloat16: 26.17 GB

LLAMA2_CONFIG_7B = {
    "vocab_size": 32000,  # Keeping ...
```
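As a sanity check on those "Previous run" numbers: the parameters are just the learned weight tensors, and the counts follow directly from the config. For example, the token-embedding matrix alone is vocab_size × emb_dim = 32,000 × 4,096 ≈ 131 million parameters. Below is a minimal sketch of how the total count and the memory figures can be reproduced in PyTorch; it assumes the memory estimate keeps one gradient per trainable weight alongside the weights themselves (which is why the float32 figure works out to roughly 8 bytes per parameter rather than 4). The `Llama2Model` name is just a placeholder for whatever module you build from the config.

```python
import torch
import torch.nn as nn

def count_parameters(model: nn.Module) -> int:
    # Total number of elements across all weight tensors
    return sum(p.numel() for p in model.parameters())

def model_memory_gb(model: nn.Module, dtype=torch.float32) -> float:
    # Bytes per element for the chosen dtype: 4 for float32, 2 for bfloat16
    element_size = torch.tensor([], dtype=dtype).element_size()
    params = sum(p.numel() for p in model.parameters())
    # Assumption: training keeps one gradient per trainable parameter
    grads = sum(p.numel() for p in model.parameters() if p.requires_grad)
    # Non-trainable buffers (e.g. precomputed RoPE tables) also occupy memory
    buffers = sum(b.numel() for b in model.buffers())
    return (params + grads + buffers) * element_size / (1024 ** 3)

# Hypothetical usage, assuming a Llama2Model class built from the config above:
# model = Llama2Model(LLAMA2_CONFIG_7B)
# print(f"Total number of parameters: {count_parameters(model):,}")
# print(f"float32 (PyTorch default): {model_memory_gb(model, torch.float32):.2f} GB")
# print(f"bfloat16: {model_memory_gb(model, torch.bfloat16):.2f} GB")
```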