LLM (Large Language Model) is a general term for neural network models pre-trained on massive amounts of text data, containing billions to trillions of parameters, capable of understanding and generating natural language with high accuracy.
Since the public release of ChatGPT in November 2022, the term LLM has spread not only among engineers but to the general public as well. The essence conveyed by the name "large language model," however, is simple: it is a model trained by feeding it large amounts of text and repeatedly having it predict the next word. What makes LLMs fascinating is that this straightforward training objective gives rise to a diverse range of emergent capabilities (translation, summarization, code generation, reasoning, and more), while theoretical understanding of why this happens has yet to catch up.

To give a concrete sense of scale: GPT-3 (2020) has 175 billion parameters, Llama 3 (2024) has 70 billion, and GPT-4's parameter count is undisclosed but estimated to exceed 1 trillion. Models generally become more capable as parameter count grows, but the fact that Llama 3 70B outperforms GPT-3 175B on many benchmarks shows that the quality of training data and improvements in architecture matter just as much, if not more.

There are three main routes for using LLMs in practice:

1. **Via API.** Call models from providers such as OpenAI or Anthropic directly. This is the most straightforward approach, but data is sent to an external party, and costs must be managed under pay-as-you-go pricing.
2. **Combining with RAG.** Retrieve internal documents and pass them to the LLM, leveraging internal knowledge while reducing hallucinations (plausible-sounding outputs that contradict the facts). Since the model itself is not modified, the barrier to adoption is low.
3. **Fine-tuning.** Adjust the model's behavior with proprietary data. This is effective when a consistent response tone or accurate use of industry-specific terminology is required, but it requires preparing training data and paying for GPU compute.
Which route to choose depends on "what problem you are trying to solve," and cases where all three are used in combination are becoming increasingly common.
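The RAG route can be sketched in a few lines: retrieve the documents most relevant to a query, then build a prompt grounded in them. The keyword-overlap retriever, document list, and helper names below are hypothetical stand-ins; a real system would use vector search and an actual LLM client.

```python
# Minimal RAG sketch (illustrative, not a specific library's API).
# Step 1: retrieve internal documents relevant to the query.
# Step 2: stuff them into the prompt so the LLM answers from that context.

def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Rank documents by naive keyword overlap with the query (toy retriever)."""
    q_words = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:top_k]

def build_prompt(query: str, context: list[str]) -> str:
    """Ground the answer in retrieved context to curb hallucination."""
    ctx = "\n".join(f"- {doc}" for doc in context)
    return (f"Answer using only the context below.\n"
            f"Context:\n{ctx}\n\nQuestion: {query}")

# Hypothetical internal documents.
docs = [
    "Expense reports must be filed within 30 days.",
    "The VPN requires two-factor authentication.",
    "Office hours are 9am to 6pm on weekdays.",
]
query = "How do I file an expense report?"
prompt = build_prompt(query, retrieve(query, docs))
```

The resulting `prompt` would then be sent to the model via whichever API route the team has chosen; swapping the toy retriever for embedding-based search changes nothing downstream.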


A2A (Agent-to-Agent Protocol) is a communication protocol, published by Google in April 2025, that enables different AI agents to perform capability discovery, task delegation, and state synchronization.

Agent Skills are reusable instruction sets defined to enable AI agents to perform specific tasks or areas of expertise, functioning as modular units that extend the capabilities of an agent.
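The "modular, reusable instruction set" idea can be made concrete with a small sketch: a skill bundles instructions with the triggers that activate it, and only matching skills are composed into the agent's system prompt. The `Skill` dataclass and `compose_system_prompt` helper are illustrative, not a specific framework's API.

```python
# Sketch of agent skills as modular instruction sets (illustrative).
from dataclasses import dataclass

@dataclass
class Skill:
    name: str
    instructions: str          # how the agent should behave in this task area
    triggers: tuple[str, ...]  # keywords that activate the skill

def compose_system_prompt(base: str, skills: list[Skill], user_input: str) -> str:
    """Attach only the skills whose triggers match the user's request."""
    text = user_input.lower()
    active = [s for s in skills if any(t in text for t in s.triggers)]
    parts = [base] + [f"## Skill: {s.name}\n{s.instructions}" for s in active]
    return "\n\n".join(parts)

# Hypothetical skill library.
skills = [
    Skill("code-review", "Check diffs for bugs and style issues.",
          ("review", "diff")),
    Skill("sql-helper", "Write safe, parameterized SQL queries.",
          ("sql", "query")),
]
prompt = compose_system_prompt("You are a helpful assistant.",
                               skills, "Please review this diff")
```

Because each skill is self-contained, the same library can be shared across agents, and adding a capability means adding a skill rather than rewriting the agent.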

Agentic AI is a general term for AI systems that interpret goals and autonomously repeat the cycle of planning, executing, and verifying actions without requiring step-by-step human instruction.
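The plan–execute–verify cycle can be sketched as a loop. The planner, executor, and verifier below are toy stand-ins (the "goal" is just a target number); in a real agent each step would call an LLM and external tools, but the control flow is the same.

```python
# Toy sketch of the agentic loop: plan, execute, verify, repeat.
# All three components are deliberately trivial stand-ins.

def plan(remaining: int) -> list[int]:
    """Break the remaining work into unit steps."""
    return [1] * remaining

def execute(state: int, step: int) -> int:
    """Apply one step to the current state."""
    return state + step

def verify(state: int, goal: int) -> bool:
    """Check whether the goal has been reached."""
    return state == goal

def run_agent(goal: int, max_iters: int = 10) -> int:
    state = 0
    for _ in range(max_iters):      # bounded autonomy: cap the iterations
        if verify(state, goal):
            break
        for step in plan(goal - state):  # re-plan from the current state
            state = execute(state, step)
    return state
```

The key property is that no step-by-step human instruction appears: the loop re-plans from the observed state until verification succeeds or the iteration budget runs out.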
