Build Your Own ‘Ask Gemini’ App in C# with Vertex AI

Ever wanted to build your own custom interface to Google’s powerful Gemini models, directly within your .NET applications? Imagine creating a service that can analyze documents, describe images, and answer complex questions, all through a clean, reusable C# class you control. It’s more achievable than you might think.

This article will show you exactly how to do that. We’re not just going to talk about theory; we’re going to build a complete, reusable C# client that acts as your personal “Ask Gemini” service, powered by the scalability and security of Google Cloud’s Vertex AI.

We’ll break down every piece of the code, explaining the “why” behind each decision so you can truly master the integration.

Prerequisites

Before we start building, make sure you have the following set up:

  1. A Google Cloud Project: Create a project in the Google Cloud Console.
  2. Enable the Vertex AI API: In your project, navigate to the API Library and enable the “Vertex AI API”.
  3. Service Account Credentials: Your application needs a secure identity. Create a service account, grant it the “Vertex AI User” role (roles/aiplatform.user), and download its JSON key file. This file is the key to your app’s access, so treat it as a secret and keep it out of source control.
  4. .NET Environment: A recent version of the .NET SDK (e.g., .NET 6, .NET 8, or later).
  5. Required NuGet Packages: Install these libraries into your C# project:
    • Google.Cloud.AIPlatform.V1
    • Grpc.Net.Client
    • Google.Apis.Auth
    • Markdig (for beautifully formatting the AI’s response)

The GeminiApi Class: Your Gateway to the AI

Our goal is to create a robust service that handles all the complexities of communicating with Gemini. Let’s dive into the code.

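// The complete code for your "Ask Gemini" client
// (the System.* lines are redundant if your project has implicit usings enabled)
using System.IO;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;
using Google.Cloud.AIPlatform.V1;
using Google.Apis.Auth.OAuth2;
using Google.Protobuf;
using Grpc.Auth;
using Grpc.Net.Client;
using Markdig;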

namespace YourProject.Gemini
{
    public class GeminiApi
    {
        private readonly PredictionServiceClient _predictionServiceClient;
        private readonly EndpointName _endpointName;

        // ... Constructor and Methods will be explained below ...
    }
}

1. The Constructor: Establishing a Secure Connection

The constructor is where the magic begins. It handles authentication and establishes a high-performance, secure connection to the Vertex AI endpoint.

public GeminiApi(string projectId, string location, string jsonCredentialsPath, string? model = null, string? proxyUrl = null)
{
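    // Default model ID; model versions are retired over time, so check that
    // this one is still offered in your region before relying on it
    model ??= "gemini-1.5-pro";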

    // 1. Authenticate using the service account JSON file
    var credential = GoogleCredential.FromFile(jsonCredentialsPath)
        .CreateScoped(PredictionServiceClient.DefaultScopes);
    var channelCredentials = credential.ToChannelCredentials();

    // 2. Handle optional proxy configuration
    HttpClientHandler handler;
    if (proxyUrl == null)
    {
        handler = new HttpClientHandler();
    }
    else
    {
        var proxy = new WebProxy(proxyUrl);
        handler = new HttpClientHandler { Proxy = proxy, UseProxy = true };
    }

    // 3. Configure the gRPC channel
    var clientEndpoint = $"{location}-aiplatform.googleapis.com";
    var channelOptions = new GrpcChannelOptions { HttpHandler = handler, Credentials = channelCredentials };
    var channel = GrpcChannel.ForAddress($"https://{clientEndpoint}", channelOptions);

    // 4. Build the Prediction Service Client
    _predictionServiceClient = new PredictionServiceClientBuilder
    {
        CallInvoker = channel.CreateCallInvoker()
    }.Build();

    // 5. Define the specific model endpoint
    _endpointName = EndpointName.FromProjectLocationPublisherModel(projectId, location, "google", model);
}

Key Points (Explained in Detail):

  • Authentication with Service Accounts: We use GoogleCredential.FromFile(). Why? Server-side applications need a secure identity. A Service Account is a non-human identity for your code. Loading its JSON key file at runtime keeps credentials out of the source code, but the file is itself a secret, so store it outside the repository and restrict who can read it. The .CreateScoped() call makes the credential request short-lived OAuth access tokens limited to the scopes in PredictionServiceClient.DefaultScopes. Those defaults include the broad cloud-platform scope rather than anything specific to prediction, so the Principle of Least Privilege is really enforced by IAM: with only the “Vertex AI User” role granted, an intercepted token can do nothing beyond what that role allows, and it expires quickly.
  • Robust Proxy Support: The code explicitly checks for a proxyUrl. Why? In real-world enterprise environments, direct internet access from servers is often forbidden. All outbound traffic must be routed through a verified HTTP proxy. By building this logic into our class, we make it immediately usable in corporate networks, demonstrating a design that considers real-world deployment challenges.
  • gRPC for Performance: The connection is made using GrpcChannel. Why not a standard REST call? Google’s Cloud APIs are heavily built on gRPC, a high-performance Remote Procedure Call framework. Unlike traditional JSON-over-HTTP, gRPC runs over HTTP/2 and serializes messages with Protocol Buffers, a compact binary format. This typically means lower latency and a smaller network footprint. The difference matters most for large payloads such as high-resolution images, because binary data in JSON has to be base64-encoded, which adds roughly a third to its size.
  • Client Abstraction: We create a PredictionServiceClient. What is its role? Think of the GrpcChannel as the secure “telephone line” to Google. The PredictionServiceClient is the “specialized phone” that knows the exact protocol to communicate with the Vertex AI service. It’s a strongly-typed client that gives you compile-time checking and IntelliSense for methods like GenerateContentAsync, drastically reducing development errors.
  • Error-Proof Endpoint Naming: We use EndpointName.FromProjectLocationPublisherModel(...). Why not just build the string? An endpoint string is long and specific (e.g., projects/my-project/locations/us-central1/...). The EndpointName helper class from the SDK constructs this string correctly for you, making the code more readable and preventing runtime errors caused by a simple typo.
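The proxy branch of the constructor can be pulled out into a small helper, which makes it easier to test and to extend. The following is a minimal sketch using only the .NET base class library. CreateHandler, the proxy URL, and the credentials are hypothetical names and placeholder values, not part of the class above. It assumes you may also need an authenticated proxy and that an empty proxy URL should be treated the same as none.

```csharp
using System;
using System.Net;
using System.Net.Http;

// Hypothetical helper for illustration; not part of the GeminiApi class above.
// Returns a handler that connects directly when no proxy URL is given,
// or routes traffic through the proxy (optionally authenticated) otherwise.
static HttpClientHandler CreateHandler(string? proxyUrl, ICredentials? proxyCredentials = null)
{
    if (string.IsNullOrWhiteSpace(proxyUrl))
        return new HttpClientHandler();

    var proxy = new WebProxy(proxyUrl) { Credentials = proxyCredentials };
    return new HttpClientHandler { Proxy = proxy, UseProxy = true };
}

// Example values are placeholders, not real endpoints or accounts.
var direct = CreateHandler(null);
var proxied = CreateHandler(
    "http://proxy.example.internal:8080",
    new NetworkCredential("svc-user", "placeholder-password"));

Console.WriteLine(direct.Proxy is null);      // True
Console.WriteLine(proxied.Proxy is WebProxy); // True
```

The returned handler would be passed to GrpcChannelOptions.HttpHandler exactly as in the constructor.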

2. The Evaluate Method: Sending Your Questions

This is the core method of our “Ask Gemini” service. It’s designed to be versatile, supporting both simple text questions and complex multimodal requests that include files.

public async Task<string> Evaluate(string prompt, string? filePathToUpload = null)
{
    // Every request carries the text prompt as its first part
    var content = new Content { Role = "USER" };
    content.Parts.Add(new Part { Text = prompt });

    if (filePathToUpload != null)
    {
        // Attach the file inline: raw bytes plus a MIME type so the model knows how to read them
        var fileBytes = await File.ReadAllBytesAsync(filePathToUpload);
        var mimeType = GetMimeType(filePathToUpload);
        var filePart = new Part { InlineData = new Google.Cloud.AIPlatform.V1.Blob { MimeType = mimeType, Data = ByteString.CopyFrom(fileBytes) } };
        content.Parts.Add(filePart);
    }

    var generateContentRequest = new GenerateContentRequest
    {
        Model = _endpointName.ToString(),
        Contents = { content }
    };

    GenerateContentResponse response = await _predictionServiceClient.GenerateContentAsync(generateContentRequest);

    string responseText = response.Candidates?.FirstOrDefault()?.Content?.Parts?.FirstOrDefault()?.Text ?? string.Empty;

    string htmlResponse = Markdown.ToHtml(responseText);
    return htmlResponse;
}

private string GetMimeType(string fileName)
{
    // A simple helper to determine MIME type from file extension
    string extension = Path.GetExtension(fileName).ToLowerInvariant();
    switch (extension)
    {
        case ".txt": return "text/plain";
        case ".pdf": return "application/pdf";
        case ".png": return "image/png";
        case ".jpg":
        case ".jpeg": return "image/jpeg";
        default: return "application/octet-stream";
    }
}

Key Points (Explained in Detail):

  • Structured Multimodality: The API expects a list of Part objects. Why this structure? Gemini is natively multimodal, and the Part structure is the mechanism for this. One Part can hold text, while another can hold file data. This allows you to make requests that relate the two, like, “Based on this chart image, summarize the key trends.” We send the file’s raw bytes (ByteString.CopyFrom(fileBytes)) along with its MimeType so the model knows precisely how to interpret the data.
  • Designed for Conversation: The request contains a list of Content objects. Why a list? This structure is designed to support multi-turn conversations. While our simple client sends one “USER” turn, you could extend it to pass the entire chat history in the Contents list, allowing Gemini to understand follow-up questions.
  • Asynchronous for Responsiveness: The method is async and uses await. Why is this essential? An AI call can take a few seconds. If it were synchronous (blocking), it would freeze your application’s UI or block a web server thread. Using async/await is the foundation of modern, scalable C# applications, ensuring your app stays responsive while waiting for the AI to think.
  • Defensive Response Parsing: The code uses null-conditional operators (?.) to extract the text: response.Candidates?.FirstOrDefault()?.Content?.Parts?.FirstOrDefault()?.Text. Why is this so important? A response is not guaranteed to contain text. If a prompt is blocked by a safety filter, the Candidates list might be empty, in which case FirstOrDefault() returns null, and a candidate may also arrive without any Content. Without the ?. operators, your code would crash with a NullReferenceException. This chained “safe navigation” gracefully returns an empty string if anything is missing. Two caveats: the expression reads only the first Part, so a reply split across several parts would be truncated, and an empty string does not tell the caller why nothing came back.
  • User-Centric Output Formatting: The final step converts the response from Markdown to HTML using Markdig. Why add this step? Gemini models often use Markdown to structure responses (headings, lists, code blocks). Converting this to HTML means you can render the output in any web-based view or UI component, which improves the user experience. Keep in mind that Markdig passes raw HTML in its input through by default, so treat the result as untrusted content and sanitize it before rendering it in a browser.
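Two of the points above, conversation history and defensive parsing, can be sketched as small pure functions that work on the SDK’s message types without making a network call. ExtractText and BuildRequest are hypothetical helper names, not part of the class above, and the model string is a placeholder. The sketch assumes earlier model replies are kept as the Content objects returned in previous responses, and it concatenates the text of all parts instead of reading only the first.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Google.Cloud.AIPlatform.V1;

// Hypothetical helpers for illustration; not part of the GeminiApi class above.

// Concatenates the text of every part in the first candidate,
// or returns an empty string when there is no candidate or no content.
static string ExtractText(GenerateContentResponse response)
{
    var parts = response.Candidates.FirstOrDefault()?.Content?.Parts;
    return parts is null ? string.Empty : string.Concat(parts.Select(p => p.Text));
}

// Builds a request that replays the earlier turns before the new user prompt.
static GenerateContentRequest BuildRequest(string model, IEnumerable<Content> history, string prompt)
{
    var request = new GenerateContentRequest { Model = model };
    request.Contents.AddRange(history);
    request.Contents.Add(new Content { Role = "USER", Parts = { new Part { Text = prompt } } });
    return request;
}

// Offline demonstration with hand-built objects; nothing is sent to Vertex AI.
var firstTurn = new Content { Role = "USER", Parts = { new Part { Text = "Say hello." } } };
var fakeResponse = new GenerateContentResponse
{
    Candidates =
    {
        new Candidate
        {
            Content = new Content
            {
                Role = "model",
                Parts = { new Part { Text = "Hello, " }, new Part { Text = "world." } }
            }
        }
    }
};

Console.WriteLine(ExtractText(fakeResponse));                         // Hello, world.
Console.WriteLine(ExtractText(new GenerateContentResponse()).Length); // 0

// The follow-up request carries both earlier turns plus the new question.
var history = new List<Content> { firstTurn, fakeResponse.Candidates[0].Content };
var followUp = BuildRequest("model-resource-name-goes-here", history, "Can you shorten that?");
Console.WriteLine(followUp.Contents.Count);                           // 3
```

In a real client, the request from BuildRequest would go to GenerateContentAsync, and the Content of the returned candidate would be appended to the history before the next turn.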

Putting Your GeminiApi to Work

Using your new class is now incredibly straightforward. Here’s a simple console app example:

using YourProject.Gemini; // Your namespace

class Program
{
    static async Task Main(string[] args)
    {
        string projectId = "your-gcp-project-id";
        string location = "us-central1"; // Or your preferred location
        string credentialsPath = "path/to/your/credentials.json";

        // 1. Create an instance of your client
        var geminiApi = new GeminiApi(projectId, location, credentialsPath);

        // 2. Ask a question!
        Console.WriteLine("--- Asking Gemini a question... ---");
        string prompt = "Explain the difference between a class and an object in C# as if you were teaching a beginner.";
        string responseHtml = await geminiApi.Evaluate(prompt);
        
        // 3. Print the response (this is HTML markup, because Evaluate converts the Markdown to HTML)
        Console.WriteLine(responseHtml);
    }
}

Conclusion

You’ve now done more than just connect to an API; you’ve built a robust and reusable foundation for any .NET application that needs to leverage generative AI. This GeminiApi class is your gateway, handling the complex parts of authentication, networking, and data formatting so you can focus on creativity. Your “Ask Gemini” client is the starting point. You can now integrate it into web APIs, desktop apps, or background services to bring next-level intelligence to your projects.