Open to opportunities

Valter Lopes

@valterlopes

AI evaluator specializing in LLM reasoning quality, long-context coherence, and hallucination detection.

Portugal

What I'm looking for

I’m looking for a remote role where I can evaluate LLM reasoning quality end-to-end—long-context coherence, hallucination detection, semantic consistency, and prompt stress testing—so teams can improve reliability, alignment, and instruction-following.

I’m an analytical, concept-oriented AI evaluator focused on the reasoning quality of LLM responses, especially across long-context interactions. I work in depth on structured reasoning exploration, recursive prompt testing, and semantic consistency analysis in Portuguese and English (PT/EN).

In my independent, remote research (2024–Present), I conduct extensive long-context interactions and evaluations to assess reasoning behavior, semantic continuity, and abstraction handling. I produce iterative evaluation frameworks and comparative analyses that detect contradictions and hallucinations, evaluate instruction-following precision, and test edge-case prompts.

I’m particularly strong at identifying subtle logical inconsistencies, ambiguity drift, narrative instability, and alignment weaknesses across extended AI conversations. I focus on ranking response quality, verifying symbolic and conceptual consistency, and stress-testing long-context coherence under challenging conceptual loads.

Experience

Work history, roles, and key accomplishments

Current

AI Evaluation & LLM Reasoning

Independent

Jan 2024 - Present (2 years 6 months)

Conduct structured AI reasoning and long-context evaluations focused on semantic continuity, hallucination detection, and recursive prompt stress-testing. Produce iterative evaluation frameworks and comparative analyses of LLM responses under abstraction-heavy conceptual loads.

LLM Evaluation, Prompt Engineering, Hallucination Detection, Semantic Consistency Testing

Education

Degrees, certifications, and relevant coursework

Valter Lopes Caldas da Rainha

AI Evaluation & LLM Reasoning Analysis

2024 -

Activities and societies: Long-context conversational evaluation; recursive prompt testing; semantic drift/hallucination detection; instruction-following checks; comparative model evaluation; PT/EN interaction; ChatGPT/LLM workflow testing.

Independent remote AI evaluation and LLM reasoning analysis focused on long-context coherence, semantic consistency, hallucination detection, and recursive prompt stress testing. Produces comparative response rankings and evaluation frameworks to assess reasoning quality and alignment weaknesses across extended PT/EN conversations.