Welcome to our comprehensive guide on building with generative AI! Before we get our hands dirty, let's understand the two core concepts we'll be working with: Large Language Models (LLMs) and APIs. An LLM is a powerful AI model trained on a massive amount of text and code. This training allows it to understand, generate, and process human language with remarkable fluency. Think of it as a super-intelligent digital brain for text. An API (Application Programming Interface) is a set of rules that allows different applications to talk to each other. In our case, the API is the bridge that lets your code send a request (like a prompt) to an LLM and receive a response (like generated text). Using an API means you don't need to host or manage the massive LLM on your own computer.
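To make that request-and-response flow concrete, here is a minimal sketch in Python of sending a prompt to an LLM over an API. It assumes the google-generativeai client library and an illustrative model name; the exact package and model you end up using may differ.

```python
# Minimal sketch: send a prompt to an LLM over its API and print the reply.
# Assumes the google-generativeai package (pip install google-generativeai)
# and an API key from Google AI Studio; the model name is illustrative.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # never hard-code keys in real projects

model = genai.GenerativeModel("gemini-1.5-flash")
response = model.generate_content("Explain what an API is in one sentence.")
print(response.text)
```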

How to Build with Generative AI

Part 1: Getting Started with Large Language Models

This tutorial concludes by showing how to build specialized AI assistants, or "Gems," using prompt engineering with a general-purpose model like Gemini. Gems are not separate models but customized assistants that excel at specific tasks, exhibiting domain expertise, specialized behavior, and a consistent persona. This is achieved through a detailed system prompt, acting as the AI's "DNA," defining its persona, goals, constraints, and formatting. The prompt includes the AI's role (e.g., financial analyst), its function, limitations (e.g., avoiding giving investment advice), and desired response structure. A "Financial Analyst Gem" example demonstrates how a well-crafted system prompt transforms a general chatbot into a specialized, focused assistant. Mastering prompt engineering unlocks the Gemini API's flexibility, allowing you to create diverse AI assistants.
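As a rough illustration of the idea, here is a minimal Python sketch of a "Financial Analyst Gem" built from a detailed system instruction. The prompt wording, model name, and use of the google-generativeai SDK are assumptions for illustration, not the tutorial's exact code.

```python
# Sketch of a "Financial Analyst Gem": a detailed system instruction gives a
# general-purpose model a persona, goals, constraints, and an output format.
# The prompt text and model name are illustrative, not from the tutorial.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

SYSTEM_PROMPT = """
You are a meticulous financial analyst.
- Goal: summarize company filings and explain key metrics in plain language.
- Constraints: never give personalized investment advice; cite the figures you use.
- Format: a short summary followed by a bulleted list of key metrics.
"""

gem = genai.GenerativeModel(
    "gemini-1.5-flash",
    system_instruction=SYSTEM_PROMPT,
)

print(gem.generate_content("Summarize the highlights of this quarterly report: ...").text)
```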
This tutorial demonstrates using Google's Gemini API for creative content generation, specifically image creation and animation. It focuses on the Imagen model for text-to-image generation, emphasizing the importance of detailed, descriptive prompts that include subject, style, setting, and mood. While a dedicated video-generation API isn't yet available, the chapter explains the concept through the Veo model and suggests a workaround based on sequential image generation. A practical example shows building a simple React app that uses the `imagen-3.0-generate-002` model to generate images from user-provided text prompts, including error handling and loading states.
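The tutorial's image calls happen inside the React app; the Python sketch below only illustrates the prompt-construction advice, assembling a prompt from subject, style, setting, and mood. The helper function and example values are hypothetical.

```python
# Sketch: assemble a detailed text-to-image prompt from its parts
# (subject, style, setting, mood), as the tutorial recommends.
# The helper and field names are illustrative, not part of any SDK.
def build_image_prompt(subject: str, style: str, setting: str, mood: str) -> str:
    return (
        f"{subject}, rendered in a {style} style, "
        f"set in {setting}, with a {mood} mood, "
        "high detail, coherent lighting"
    )

prompt = build_image_prompt(
    subject="a lighthouse on a rocky coast",
    style="watercolor",
    setting="a stormy evening sea",
    mood="dramatic, moody",
)
print(prompt)  # pass this string to the imagen-3.0-generate-002 call in the app
```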
This chapter explores using Gemini as a coding assistant. It covers code generation, debugging, and optimization. For code generation, clear prompts specifying the goal, language/framework, and context are crucial. Debugging prompts should include the code, a problem description, and any error messages. Optimization prompts require the code and the optimization goal. Gemini can generate code in various languages, identify and suggest fixes for bugs, and improve code performance, streamlining the coding workflow for developers of all skill levels. Examples using Python and JavaScript demonstrate these capabilities.
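As a hedged example of the debugging workflow described above, the Python sketch below bundles the code, a problem description, and the error message into a single prompt. The model name and prompt wording are assumptions, not the chapter's exact examples.

```python
# Sketch: build a debugging prompt that combines the code, a description of
# the problem, and the error message, then send it to the model.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-flash")

buggy_code = """
def average(numbers):
    return sum(numbers) / len(numbers)
"""
error_message = "ZeroDivisionError: division by zero"

debug_prompt = (
    "You are a Python debugging assistant.\n"
    f"Code:\n{buggy_code}\n"
    "Problem: the function crashes when called with an empty list.\n"
    f"Error message: {error_message}\n"
    "Explain the bug and suggest a fix."
)

print(model.generate_content(debug_prompt).text)
```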
Welcome to the final section of our tutorial series! So far, you've mastered the fundamentals of the Gemini API, from your first API call to building sophisticated applications that handle stateful conversations and perform deep research. You now have a strong foundation in a wide range of Gemini's core capabilities.
This guide explains how to use the Gemini API for in-depth research, even without a dedicated "Deep Research" feature. It emphasizes crafting sophisticated prompts to achieve this. "Deep research" involves synthesizing information from multiple sources, performing structured analysis, and generating comprehensive reports. The guide provides a prompt-engineering template: defining the AI's persona and goal, specifying the topic and scope, listing required report sections, adding formatting requirements, and setting constraints. A well-structured prompt acts as a research plan for Gemini, resulting in thorough analysis. The guide concludes with a brief mention of a sample React application showcasing this process.
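A minimal sketch of that prompt template in Python might look like the following; the persona, scope, section names, and constraints are placeholders, not the guide's exact template.

```python
# Sketch of the "deep research" prompt template: persona and goal, topic and
# scope, required report sections, formatting requirements, and constraints.
def build_research_prompt(topic: str, sections: list[str]) -> str:
    section_list = "\n".join(f"{i + 1}. {s}" for i, s in enumerate(sections))
    return (
        "You are a senior research analyst. Your goal is to produce a "
        f"comprehensive, well-sourced report on: {topic}.\n"
        "Scope: developments from the last five years only.\n"
        f"Required sections:\n{section_list}\n"
        "Formatting: use markdown headings and bullet points; keep each "
        "section under 300 words.\n"
        "Constraints: clearly flag any claim you are unsure about."
    )

prompt = build_research_prompt(
    "the adoption of large language models in healthcare",
    ["Executive summary", "Key use cases", "Risks and limitations", "Outlook"],
)
print(prompt)
```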
This tutorial demonstrates building a stateful chat application using React and the Gemini API. It leverages React's state management to maintain conversation history (`messages`, `input`, `isLoading`), automatically scrolling to new messages using `useRef` and `useEffect`. The core functionality lies in `callGeminiAPI`, which sends the entire conversation history to the Gemini API for context-aware responses, incorporating exponential backoff for error handling. The UI, built with JSX and Tailwind CSS, displays messages differently based on sender (user/model) and includes a simple input form. The complete code is provided for a functional application.
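The tutorial implements this in React; the Python sketch below captures the same two ideas under assumed SDK behavior: resend the full conversation history on every turn, and retry failed calls with exponential backoff. The model name and error-handling details are illustrative.

```python
# Sketch: (1) keep the whole conversation history and send it on each turn so
# the model has context; (2) retry transient failures with exponential backoff.
import time
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-flash")

history = []  # list of {"role": ..., "parts": [...]} turns


def send_with_backoff(user_text: str, max_retries: int = 5) -> str:
    history.append({"role": "user", "parts": [user_text]})
    for attempt in range(max_retries):
        try:
            # Passing the full history gives the model conversational context.
            response = model.generate_content(history)
            history.append({"role": "model", "parts": [response.text]})
            return response.text
        except Exception:
            # Wait 1s, 2s, 4s, ... before retrying.
            time.sleep(2 ** attempt)
    raise RuntimeError("Gemini API call failed after retries")


print(send_with_backoff("Hi! Remember that my favorite color is teal."))
print(send_with_backoff("What's my favorite color?"))
```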