# RunPod Serverless - GPU Services

## Overview

RunPod Serverless provides on-demand GPUs for the GRACE, PENNY, and FACTORY services.

**Account**: rpd@tzr.systems
**Balance**: ~$69 USD
**Maximum quota**: 5 concurrent workers

---

## Active Endpoints

| Service | Endpoint ID | Max Workers | Modules |
|----------|-------------|-------------|---------|
| **GRACE** | `rfltzijgn1jno4` | 2 | ASR, OCR, TTS, Face, Embeddings, Avatar |
| **PENNY** | `zsu7eah0fo7xt6` | 2 | TTS (voice) |
| **FACTORY** | `hffu4q5pywjzng` | 1 | Embeddings, processing |

**Supported GPUs**: RTX 3090, RTX 4090, RTX A4000, NVIDIA L4

---

## Critical Configuration

### Problem Solved (2025-12-27)

Serverless workers would not start when endpoints were created through the GraphQL API. The fix was to use the **REST API** with specific parameters.

### Correct API: REST (not GraphQL)

```bash
# CORRECT - REST API
curl -X POST "https://rest.runpod.io/v1/endpoints" \
  -H "Authorization: Bearer $RUNPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "mi-endpoint",
    "templateId": "TEMPLATE_ID",
    "gpuTypeIds": ["NVIDIA GeForce RTX 3090", "NVIDIA GeForce RTX 4090"],
    "scalerType": "QUEUE_DELAY",
    "scalerValue": 4,
    "workersMin": 0,
    "workersMax": 2,
    "idleTimeout": 60,
    "executionTimeoutMs": 600000,
    "flashboot": true
  }'
```

**Required fields**:

- `gpuTypeIds`: array of strings (NOT a comma-separated string)
- `scalerType`: "QUEUE_DELAY"
- `scalerValue`: 4 (seconds of queue delay before scaling up)
- `flashboot`: true (fast cold start)

---

## API Usage

### Submit a Job

```bash
curl -X POST "https://api.runpod.ai/v2/{ENDPOINT_ID}/run" \
  -H "Authorization: $RUNPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"input": {"module": "ASR_ENGINE", "audio_base64": "..."}}'
```

### Check Status

```bash
curl "https://api.runpod.ai/v2/{ENDPOINT_ID}/status/{JOB_ID}" \
  -H "Authorization: $RUNPOD_API_KEY"
```

### Health Check

```bash
curl "https://api.runpod.ai/v2/{ENDPOINT_ID}/health" \
  -H "Authorization: $RUNPOD_API_KEY"
```
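The submit/status calls above can be wrapped in a small client. Below is a minimal Python sketch using only the standard library; the helper names (`run_url`, `status_url`, `job_payload`, `submit_job`) are illustrative, not part of any RunPod SDK, and the payload shape simply mirrors the `{"input": {...}}` body shown in the curl example.

```python
import json
import urllib.request

API_BASE = "https://api.runpod.ai/v2"


def run_url(endpoint_id: str) -> str:
    """URL for submitting a job to a serverless endpoint."""
    return f"{API_BASE}/{endpoint_id}/run"


def status_url(endpoint_id: str, job_id: str) -> str:
    """URL for polling a submitted job's status."""
    return f"{API_BASE}/{endpoint_id}/status/{job_id}"


def job_payload(module: str, **fields) -> dict:
    """Build the {"input": {...}} body expected by /run."""
    return {"input": {"module": module, **fields}}


def submit_job(api_key: str, endpoint_id: str, payload: dict) -> dict:
    """POST the job and return the parsed JSON response."""
    req = urllib.request.Request(
        run_url(endpoint_id),
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": api_key,
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Usage (assuming `RUNPOD_API_KEY` is loaded as described in the Credentials section): `submit_job(api_key, "rfltzijgn1jno4", job_payload("ASR_ENGINE", audio_base64="..."))`, then poll `status_url(...)` with the returned job ID.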
Expected response:

```json
{
  "jobs": {"completed": 5, "failed": 0, "inQueue": 0},
  "workers": {"idle": 1, "ready": 1, "running": 0}
}
```

---

## GRACE Modules

| Module | Description | Model |
|--------|-------------|--------|
| `ASR_ENGINE` | Speech-to-Text | Faster Whisper Large V3 |
| `OCR_CORE` | Text recognition | GOT-OCR 2.0 |
| `TTS` | Text-to-Speech | XTTS-v2 |
| `FACE_VECTOR` | Face vectors | InsightFace |
| `EMBEDDINGS` | Text embeddings | BGE-Large |
| `AVATAR_GEN` | Avatar generation | SDXL |

---

## Docker Image

**Temporary registry**: `ttl.sh/tzzr-grace:24h` (expires after 24h)

### Rebuild

```bash
# On the deck server (72.62.1.113)
cd /tmp/docker-grace
docker build -t ttl.sh/tzzr-grace:24h .
docker push ttl.sh/tzzr-grace:24h
```

### Key Dockerfile lines

```dockerfile
FROM runpod/pytorch:2.1.0-py3.10-cuda11.8.0-devel-ubuntu22.04

# Fix blinker before installing requirements
RUN pip install --no-cache-dir --ignore-installed blinker
RUN pip install --no-cache-dir -r requirements.txt
```

---

## Troubleshooting

### Workers do not start (0 workers)

1. Use the REST API, not GraphQL
2. Verify that `gpuTypeIds` is an array
3. Include `flashboot: true`
4. Verify the worker quota is not exceeded

### Job stuck in queue indefinitely

```bash
# Purge the queue
curl -X POST "https://api.runpod.ai/v2/{ENDPOINT_ID}/purge-queue" \
  -H "Authorization: $RUNPOD_API_KEY"
```

### Model error (TOS)

Add the environment variable: `COQUI_TOS_AGREED=1`

---

## Credentials

Location: `/home/orchestrator/.secrets/runpod_api_key`

```bash
export RUNPOD_API_KEY=$(cat ~/.secrets/runpod_api_key)
```

---

## References

- [RunPod REST API](https://docs.runpod.io/api-reference)
- [Serverless Endpoints](https://docs.runpod.io/serverless/endpoints/manage-endpoints)
- [Status Page](https://uptime.runpod.io)
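For monitoring scripts, the health-check response shown earlier can be interpreted programmatically. A minimal sketch, assuming the `jobs`/`workers` JSON shape documented above; the helper names (`endpoint_is_ready`, `queue_depth`) are hypothetical, chosen here for illustration:

```python
import json


def endpoint_is_ready(health: dict) -> bool:
    """True when at least one worker is idle or ready to take jobs."""
    workers = health.get("workers", {})
    return workers.get("idle", 0) + workers.get("ready", 0) > 0


def queue_depth(health: dict) -> int:
    """Number of jobs still waiting in the endpoint's queue."""
    return health.get("jobs", {}).get("inQueue", 0)


# Sample response from the Health Check section:
sample = json.loads(
    '{"jobs": {"completed": 5, "failed": 0, "inQueue": 0},'
    ' "workers": {"idle": 1, "ready": 1, "running": 0}}'
)
```

With `sample`, `endpoint_is_ready(sample)` is true and `queue_depth(sample)` is 0, which matches the "jobs flowing, workers available" state the runbook treats as healthy; a nonzero `queue_depth` with no ready workers is the symptom covered in Troubleshooting.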