Document Generator
The Document Generator allows you to create synthetic documents in various formats (PDF, DOCX, HTML) with customizable parameters. This is particularly useful for generating test data, creating templates, or producing sample documents for training purposes.
- Generate single or multiple documents
- Customize document type, audience, and tone
- Support for multiple output formats (PDF, DOCX, HTML)
- Control over document length and style
- Regional and language customization
Allowed values for the tone input parameter
The tone input parameter must be one of the following values: formal, casual, persuasive, empathetic, inspirational, enthusiastic, humorous, neutral.
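Because an unsupported tone is easy to pass by accident, it can help to check the value before calling the generator. The following is a minimal sketch, not part of ydata-sdk: `ALLOWED_TONES` and `validate_tone` are hypothetical names introduced here for illustration, and the lowercase normalization is an assumption about how you may want to treat user input.

```python
# Hypothetical helper, not part of ydata-sdk: checks a tone against the
# list of accepted values documented above.
ALLOWED_TONES = frozenset(
    {
        "formal",
        "casual",
        "persuasive",
        "empathetic",
        "inspirational",
        "enthusiastic",
        "humorous",
        "neutral",
    }
)


def validate_tone(tone: str) -> str:
    """Return the tone in lowercase, or raise ValueError if it is not allowed."""
    normalized = tone.strip().lower()
    if normalized not in ALLOWED_TONES:
        allowed = ", ".join(sorted(ALLOWED_TONES))
        raise ValueError(f"Unsupported tone {tone!r}; expected one of: {allowed}")
    return normalized
```

The returned value can then be passed as the `tone` argument, for example `tone=validate_tone("Neutral")`.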
Don't forget to set up your license key
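One way to catch a missing key early is to check the environment before creating the generator. This is a sketch under assumptions: the variable name `YDATA_LICENSE_KEY` and the placeholder `add-sdk-key` are taken from the example on this page, while `require_license_key` is a hypothetical helper and not an SDK function.

```python
import os

# Environment variable name and placeholder value as used in the example
# on this page.
LICENSE_ENV_VAR = "YDATA_LICENSE_KEY"
PLACEHOLDER = "add-sdk-key"


def require_license_key(env=None) -> str:
    """Return the configured license key, or raise if it is missing.

    Hypothetical helper, not part of ydata-sdk. `env` defaults to
    os.environ and can be replaced by a plain dict for testing.
    """
    env = os.environ if env is None else env
    key = env.get(LICENSE_ENV_VAR, "").strip()
    if not key or key == PLACEHOLDER:
        raise RuntimeError(
            f"Set the {LICENSE_ENV_VAR} environment variable to your license key."
        )
    return key
```

Calling `require_license_key()` at the top of a script fails fast with a readable message instead of a later authentication error.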
Example Code
"""
Document Generator Example
"""
import os
from ydata.synthesizers.text.model.document import DocumentGenerator, DocumentFormat
if __name__ == "__main__":
    # Step 1: Authenticate with ydata-sdk
    os.environ["YDATA_LICENSE_KEY"] = "add-sdk-key"  # Replace with your license key

    # Step 2: Initialize the DocumentGenerator with the desired format
    print("Initializing Document Generator...")
    generator = DocumentGenerator(
        document_format=DocumentFormat.PDF  # Output format (PDF, DOCX, or HTML)
    )

    # Step 3: Generate a single document
    # Note: The tone parameter accepts one of the following values:
    # [formal, casual, persuasive, empathetic, inspirational, enthusiastic, humorous, neutral]
    print("\n=== Generating Single Invoice Document ===")
    generator.generate(
        n_docs=1,  # Generate one document
        document_type="Invoice",  # Type of document to generate
        audience="Corporate client",  # Target audience
        tone="formal",  # Writing tone (must be one of the predefined values)
        purpose="Issue a detailed invoice for services rendered. Please provide detailed examples and real line items",  # Document purpose
        region="North America",  # Regional context
        language="English",  # Output language
        length="Long",  # Document length (invoices are usually not long)
        topics="Consulting services, Hourly rates, Tax breakdown, Payment terms",
        # Key topics as a single comma-separated string
        style_guide="Professional design for a financial institution",  # Style or branding requirements
        output_dir="output/documents",  # Output directory
    )

    print("\n=== Generating Single Invoice (Supermarket) Document ===")
    generator.generate(
        n_docs=1,  # Generate one document
        document_type="Invoice",  # Still an invoice
        audience="Retail customer",  # Target audience is a consumer
        tone="formal",  # Same tone, applied to a consumer-facing document
        purpose="Detailed supermarket invoice with grocery and household item purchases.",
        # Purpose tailored to retail
        region="North America",  # Regional context
        language="English",  # Output language
        length="Long",  # Allows for many line items
        topics="Groceries, Household goods, Unit price, Quantity, Subtotals, Tax, Total due, Payment method",
        # Supermarket-specific topics
        style_guide="Clean and readable receipt-style format typical of supermarket invoices",
        # Style expectation for consumer retail
        output_dir="output/documents",  # Output directory
    )

    # Step 4: Generate multiple documents with the same parameters
    print("\n=== Generating Multiple Documents ===")
    generator.generate(
        n_docs=5,  # Generate 5 documents with the same parameters
        document_type="Report",
        audience="Technical",
        tone="neutral",  # Writing tone (must be one of the predefined values)
        purpose="Technical documentation",
        region="Global",
        language="English",
        length="Medium",
        topics="API documentation, code examples, best practices",
        style_guide="Clear and concise",
        output_dir="output/documents",
    )
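After the example runs, you may want to confirm what was written to the output directory. The sketch below uses only the Python standard library. `list_generated_documents` is a hypothetical helper, and it assumes that generated files are written directly into `output_dir` with an extension matching the chosen format (for example `.pdf`), which this page does not state.

```python
from pathlib import Path


def list_generated_documents(output_dir: str, extension: str = ".pdf") -> list[Path]:
    """Return the files in output_dir with the given extension, sorted by path.

    Hypothetical helper, not part of ydata-sdk. Assumes generated files sit
    directly in output_dir and carry an extension matching the output format.
    """
    directory = Path(output_dir)
    if not directory.is_dir():
        # Nothing has been generated yet, or the path is wrong.
        return []
    return sorted(
        path
        for path in directory.iterdir()
        if path.is_file() and path.suffix.lower() == extension.lower()
    )
```

For example, `for path in list_generated_documents("output/documents"): print(path.name)` prints the names of the PDF files found there, if any.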