Document Generator
ydata.synthesizers.text.model.document.DocumentFormat
Bases: Enum
Enum representing supported output formats for synthetic document generation.
Attributes:
Name | Type | Description |
---|---|---|
DOCX | | Microsoft Word document format (docx) |
PDF | | Portable Document Format (pdf) |
HTML | | HyperText Markup Language format (html) |
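To illustrate how an enum like this is typically used, the sketch below defines a local stand-in with the same three members. The member values ("docx", "pdf", "html") are an assumption based on the descriptions above, and `format_from_path` is a hypothetical helper, not part of the library.

```python
from enum import Enum
from pathlib import Path


class DocumentFormat(Enum):
    """Local stand-in mirroring the documented members (values are assumed)."""

    DOCX = "docx"
    PDF = "pdf"
    HTML = "html"


def format_from_path(path: str) -> DocumentFormat:
    """Hypothetical helper: infer the output format from a file extension."""
    suffix = Path(path).suffix.lstrip(".").lower()
    try:
        # Enum lookup by value raises ValueError for unknown values.
        return DocumentFormat(suffix)
    except ValueError:
        raise ValueError(f"Unsupported document format: {suffix!r}") from None
```

Looking a member up by value, as in `DocumentFormat("pdf")`, keeps the set of accepted formats in one place, so an unsupported extension fails early with a `ValueError`.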
ydata.synthesizers.text.model.document.DocumentGenerator
Bases: BaseGenerator
A class for generating synthetic documents in various formats (DOCX, PDF, HTML) based on input specifications.
Features
- Support for multiple document formats (DOCX, PDF, HTML)
- Configurable LLM selection
- Template-based document generation
- Customizable document structure and styling
- Batch processing of multiple document specifications
Parameters:
Name | Type | Description | Default |
---|---|---|---|
api_key | str | API key for the LLM provider | required |
provider | Union[LLMProvider, str] | The LLM provider to use | OPENAI |
model_name | Optional[Union[OpenAIModel, AnthropicModel, str]] | Specific model to use | GPT_4 |
default_format | DocumentFormat | Default output format if not specified in request | required |
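A minimal sketch of constructing the generator from the parameters above. It assumes both classes are importable from `ydata.synthesizers.text.model.document`, as the qualified names on this page suggest. The environment variable name `OPENAI_API_KEY` and the helper `make_generator` are hypothetical choices for illustration. `provider` and `model_name` are left at their documented defaults.

```python
import os


def make_generator(env_var: str = "OPENAI_API_KEY"):
    """Hypothetical helper: build a DocumentGenerator from an env variable."""
    api_key = os.environ.get(env_var)
    if not api_key:
        raise ValueError(f"Set {env_var} before creating a DocumentGenerator")

    # Imported lazily so this module still loads where the SDK is not installed.
    # The import path is assumed from the qualified class names on this page.
    from ydata.synthesizers.text.model.document import (
        DocumentFormat,
        DocumentGenerator,
    )

    # Only the two parameters marked "required" are passed explicitly.
    return DocumentGenerator(api_key=api_key, default_format=DocumentFormat.PDF)
```

Reading the key from the environment keeps credentials out of source code. Passing `provider` or `model_name` explicitly follows the same keyword pattern.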
generate(document_type=None, n_docs=1, audience=None, tone=None, purpose=None, region=None, language=None, length=None, topics=None, style_guide=None, output_dir=None, **kwargs)
Generate documents based on input specifications.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
document_type | str | Type of document to generate | None |
audience | str | Target audience for the document | None |
tone | str | Desired tone of the document. Allowed values: formal, casual, persuasive, empathetic, inspirational, enthusiastic, humorous, neutral. | None |
purpose | str | Purpose of the document | None |
region | str | Target region/locale | None |
language | str | Language of the document | None |
length | str | Desired length of the document | None |
topics | str | Key points to cover | None |
style_guide | str | Style guide to follow | None |
output_dir | str | Directory to store generated documents | None |
**kwargs | | Additional arguments to pass to the generation process | {} |
Raises:
Type | Description |
---|---|
ValueError | If input validation fails or document format is unsupported |
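The sketch below shows one way to assemble arguments for `generate` and check the tone before any LLM call is made. `build_spec` and `ALLOWED_TONES` are hypothetical names introduced here. The tone list is copied from the parameter table. `n_docs` appears in the signature but is not described in the table, so treating it as the number of documents to produce is an assumption.

```python
ALLOWED_TONES = {
    "formal", "casual", "persuasive", "empathetic",
    "inspirational", "enthusiastic", "humorous", "neutral",
}


def build_spec(document_type, tone=None, n_docs=1, **extra):
    """Hypothetical helper: assemble keyword arguments for generate()."""
    if tone is not None and tone not in ALLOWED_TONES:
        raise ValueError(f"Unsupported tone: {tone!r}")
    if n_docs < 1:
        # Assumption: n_docs is the number of documents to generate.
        raise ValueError("n_docs must be at least 1")
    spec = {"document_type": document_type, "tone": tone, "n_docs": n_docs, **extra}
    # Drop unset values so generate() falls back to its documented defaults.
    return {key: value for key, value in spec.items() if value is not None}


spec = build_spec(
    "invoice", tone="formal", n_docs=3, language="English", output_dir="out"
)
# With a configured DocumentGenerator instance named `generator`:
# generator.generate(**spec)
```

Validating locally mirrors the documented `ValueError` behaviour and catches a mistyped tone before a request is sent. The library may apply additional checks of its own.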