no-js web accessible help

Started by kdmoyers, April 29, 2026, 02:12:23 PM


kdmoyers

The crashing weight of the LLM revolution/tragedy has come to my company too, and it is a shame that the LLMs seem to have trouble accessing help info for WinBatch.

If you start at https://docs.winbatch.com/ you get the cool left sidebar interface, which seems to baffle the LLMs.  I don't think they do the cool interface thing.

If you start at https://docs.winbatch.com/Contents.htm you can't actually penetrate to the content. There just don't seem to be any links inward.

I've seen https://docs.winbatch.com/mergedProjects/WindowsInterfaceLanguage/html/HTMLWIL_WIL001.htm
mentioned -- is this the best no-JS entry point to recommend to the LLMs?

This icky future is here, and I hate to see us drift away from WinBatch simply because the LLMs don't know it.  It doesn't matter to me, I already know it, but the junior programmers are lazy...

thanks in advance,
Kirby
The mind is everything; What you think, you become.

td

Are you sure that JavaScript is the problem? It could just be the delay before the context sidebar is populated.

The big boys scan this Website and the Techsupport Website constantly, and they seem to grasp the WinBatch basics. I have been having some fun developing my own machine learning system, but have not tackled training the model with WinBatch scripts and documentation yet. Probability and information theory are enough to try to wrap my head around for now...
"No one who sees a peregrine falcon fly can ever forget the beauty and thrill of that flight."
  - Dr. Tom Cade

bottomleypotts

LLMs aren't coding because they are reading the manual. They're coding because they have examples. Until there is a tonne of well-coded, well-commented WinBatch examples, LLMs won't be able to help us.

td

Generally, training machine learning models involves both ingesting documentation and examples. It can be very tedious because coding examples should have an associated prompt. At the very least, the examples should contain quality code comments.
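To make that concrete: instruction-tuning data is commonly stored as prompt/completion pairs in JSONL (one JSON object per line). A minimal sketch — the field names and the WinBatch snippets here are illustrative stand-ins, not taken from any real training set:

```python
import json

# Hypothetical prompt/completion pairs for supervised fine-tuning.
# WIL's Message() and TimeDelay() are real functions; the pairing
# format itself varies between training pipelines.
examples = [
    {
        "prompt": "Write a WinBatch script that shows a message box.",
        "completion": 'Message("Greeting", "Hello from WinBatch!")',
    },
    {
        "prompt": "Pause a WinBatch script for two seconds.",
        "completion": "TimeDelay(2)  ; wait two seconds",
    },
]

def to_jsonl(records):
    """Serialize records as JSONL, one JSON object per line."""
    return "\n".join(json.dumps(r) for r in records)

jsonl = to_jsonl(examples)

# Round-trip check: every line parses back with both fields present.
for line in jsonl.splitlines():
    rec = json.loads(line)
    assert "prompt" in rec and "completion" in rec
```

The point of the associated prompt is exactly what's described above: without it, the fine-tuning stage has code but no signal about what request that code answers.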
"No one who sees a peregrine falcon fly can ever forget the beauty and thrill of that flight."
  - Dr. Tom Cade

bottomleypotts

It's true for the instruction-tuning / supervised fine-tuning stage.

However, that paired data is not where most of an LLM's coding ability comes from. That's built during the massive self-supervised pre-training phase, where the model simply predicts the next token across trillions of tokens of raw internet-sourced data: GitHub repos, Stack Overflow threads, notebooks, etc.

td

Perhaps read my previous post a little more carefully. Also, most programming language documentation contains examples. That is true of WinBatch documentation. And models can be adjusted once created with settings like top-k, top-p, tau, eta, mu, etc. These settings adjust the accuracy and performance. Fun math.

Anthropic's Claude AI chatbot will tell you that training for model updates comes from "publicly indexed places (forums, GitHub, official docs)."
"No one who sees a peregrine falcon fly can ever forget the beauty and thrill of that flight."
  - Dr. Tom Cade

bottomleypotts

I did read your post carefully.

The vast majority of an LLM's coding ability does not come from the handful of paired prompt+example snippets you see in official docs or during supervised fine-tuning. That stage is small.

What actually teaches the model how to code is the pre-training phase — next-token prediction on trillions of raw tokens scraped from the entire internet: GitHub repos, Stack Overflow, notebooks, forums, etc. That's where the model learns patterns, idioms, error handling, best practices, etc. for popular languages.

WinBatch is extremely obscure.

  • It has almost zero presence on GitHub (the "winbatch" topic barely exists and has virtually no stars or activity).
  • Stack Overflow has a handful of ancient questions.
  • The official WinBatch site and tech support database have maybe a couple thousand examples total.

That is a rounding error compared to the scale of pre-training data. A few dozen (or even a few thousand) examples in the docs are nowhere near enough for the model to actually learn the language the way it learned Python, JavaScript, or even PowerShell.

The sampling settings you mentioned (top-p, temperature, top-k, etc.) are inference knobs — they only change how the already-trained model picks the next token. They don't add new knowledge.
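A toy sketch makes the "inference knobs" point concrete. The snippet below applies temperature and nucleus (top-p) filtering to a fixed, made-up logit vector: the knobs only reshape a distribution the trained model already produced; they cannot inject tokens or knowledge the weights lack.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature rescales logits before normalizing; low T sharpens
    # the distribution, high T flattens it. No new knowledge appears.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, p=0.9):
    # Nucleus (top-p) sampling: keep the smallest set of tokens whose
    # cumulative probability reaches p, then renormalize over that set.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= p:
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}

# Hypothetical logits for four candidate next tokens.
logits = [2.0, 1.0, 0.5, -1.0]
sharp = softmax(logits, temperature=0.5)   # low T: more deterministic
flat = softmax(logits, temperature=2.0)    # high T: closer to uniform
nucleus = top_p_filter(softmax(logits), p=0.9)
```

Either way, the candidate set and its relative ordering come entirely from pre-training; sampling just decides how greedily to pick from it.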

And yes, Claude is correct that training data comes from "publicly indexed places." The problem is WinBatch simply isn't there in any meaningful quantity. That's exactly why LLMs suck at it.

So no — the official docs with their limited examples are not going to magically give the model WinBatch superpowers. We'd need orders of magnitude more well-commented, real-world WinBatch code in the wild for that to happen.

spl

Very interesting discussion. But, you know me... I go on tangents. Training an LLM to write code (correctly) can be a holy grail, but I have become interested in using a local LLM to digest schema information from a database [a post with code I made a couple of months ago for making schemas]... and then act as a 'bot' to answer questions about the data in natural language.

I am using "llama3.1" as the model, and the requests go to a local API - "http://localhost:11434/api/generate" - all done via HTTP request. The request could be made from WB via a CLR System.Net.HttpWebRequest, or possibly the COM WinHttp object. The issue is streaming the result content, so for now I am using PS and the ConvertFrom-Json cmdlet.

But as both Tony and BP noted - lots of behind-the-scenes work required. Designing and training a db to respond to natural questions like "Who had the least leads from our ads last week and why do you think that is?" without an analyst crunching data in Excel... scary, but not difficult.
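For what it's worth, Ollama's /api/generate endpoint streams newline-delimited JSON, each chunk carrying a "response" fragment and a "done" flag, so the stream can be reassembled with a plain JSON parser instead of ConvertFrom-Json. A minimal Python sketch — the sample chunks are made up, and real chunks carry extra fields (model, timings, etc.):

```python
import json

def collect_stream(ndjson_lines):
    """Concatenate the 'response' fragments from an Ollama-style
    streaming reply (newline-delimited JSON) until 'done' is true."""
    parts = []
    for line in ndjson_lines:
        line = line.strip()
        if not line:
            continue
        chunk = json.loads(line)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break
    return "".join(parts)

# Illustrative stand-in for what http://localhost:11434/api/generate
# streams back over HTTP; the answer text here is invented.
sample = [
    '{"model":"llama3.1","response":"Acme","done":false}',
    '{"model":"llama3.1","response":" had the fewest leads.","done":false}',
    '{"model":"llama3.1","response":"","done":true}',
]
print(collect_stream(sample))  # -> Acme had the fewest leads.
```

The same line-by-line loop would work over whatever transport delivers the body (WinHttp, System.Net.HttpWebRequest, or anything else), since the chunk boundaries are just newlines.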
Stan - formerly stanl [ex-Pundit]

td

Quote from: bottomleypotts on May 01, 2026, 09:20:56 PMI did read your post carefully.


Me thinks you claim too much and too little at the same time.
"No one who sees a peregrine falcon fly can ever forget the beauty and thrill of that flight."
  - Dr. Tom Cade

kdmoyers

One more question: is there a link I could point the LLMs to that exposes the tech database in an LLM-friendly format?   How would this do?

https://techsupt.winbatch.com/webcgi/webbatch.exe?techsupt/nftechsupt.web+WinBatch

But maybe I should not suggest the link, because so much of the code in there is kinda old? 
Not sure.  What do you think?
The mind is everything; What you think, you become.

td

Since you ask, I don't think the age of the examples matters much as long as the syntax is correct. I am not sure what the best approach to the tech database is. I would need to get back to you on that. I do know that several bots regularly navigate the tech database on their own. For example, Yandex, OpenAI, and Amazon bots are navigating as I write this. How deeply the bots are traversing the site is a question I haven't checked into yet.

Since WIL has a Lua-like syntax, it shouldn't take too many examples to train a model on basic syntax. For example, Anthropic's chatbot suggests that 500-1000 good examples are sufficient and better than 100,000 random ones. Reportedly, synthetic data generation is commonly used to train models on programming languages. Basically, AI trains AI. This would need tech-bro buy-in to accomplish, and what could possibly go wrong? <grin>

My views on machine learning have changed a bit. I am still very sceptical, but my youngest has shown me what task specific models can do in computational biology, genetics, immunology, and several other fields. 

I learn by doing, so I have an AI-related side project made up of many smaller tasks. The results have almost no probability of seeing the light of day, but I might learn something in the doing.

And I apologize for wandering off topic again.
"No one who sees a peregrine falcon fly can ever forget the beauty and thrill of that flight."
  - Dr. Tom Cade

bottomleypotts

Quote from: td on May 03, 2026, 11:47:24 PMMe thinks you claim too much and too little at the same time.

I'm not sure what you're saying here. But good chat!

Quote from: td on May 04, 2026, 09:22:19 AMBasically, AI trains AI.

The big one everyone talks about. But when models train recursively on their own (or prior gen) outputs, they lose diversity, amplify subtle errors, and get narrower/confident-but-wrong. This has been documented since the 2023-2024 papers and is still a live concern in 2026 discussions. Good in theory.

Quote from: td on May 04, 2026, 09:22:19 AMMy views on machine learning have changed a bit. I am still very sceptical, but my youngest has shown me what task specific models can do in computational biology, genetics, immunology, and several other fields.

Biology has clearer "ground truth" + faster, cheaper validation loops. That's why we're making progress in this field.

Coding is open-ended engineering, not prediction. WinBatch is the poster child for the "long tail" problem in AI coding. Almost zero public training data. There is no verification loop payoff!

td

Quote from: kdmoyers on May 04, 2026, 07:49:02 AMOne more question: is there a link I could point the llms to that exposes the tech database in an llm friendly format?   How would this do?

https://techsupt.winbatch.com/webcgi/webbatch.exe?techsupt/nftechsupt.web+WinBatch

But maybe I should not suggest the link, because so much of the code in there is kinda old? 
Not sure.  What do you think?

The ChatGPT bot is crawling articles this morning. The bot is harvesting each topic without issue. I also need to mention that the Website throttles bots a bit because of CPU usage spikes. The big boys know better, but the Python-based "kiddie scrapers" not so much.
"No one who sees a peregrine falcon fly can ever forget the beauty and thrill of that flight."
  - Dr. Tom Cade

kdmoyers

Thanks Tony!!  We're all new at this stuff.
The mind is everything; What you think, you become.

td

Quote from: bottomleypotts on Today at 12:23:04 AMI'm not sure what you're saying here. But good chat!

Maybe an AI model will explain it.

Quote from: bottomleypotts on Today at 12:23:04 AMThe big one everyone talks about. But when models train recursively on their own (or prior gen) outputs, they lose diversity, amplify subtle errors, and get narrower/confident-but-wrong. This has been documented since the 2023-2024 papers and is still a live concern in 2026 discussions. Good in theory.

Like I said, "what could possibly go wrong?" We are familiar with the chatter around the topic.


Quote from: bottomleypotts on Today at 12:23:04 AMBiology has clearer "ground truth" + faster, cheaper validation loops. That's why we're making progress in this field.


Quote from: bottomleypotts on Today at 12:23:04 AMCoding is open-ended engineering, not prediction. WinBatch is the poster child for the "long tail" problem in AI coding. Almost zero public training data. There is no verification loop payoff!

As my youngest, who works in both fields, might tell you, coding is more deterministic than the biological sciences.

I have a hypothesis concerning the coding part that I will be testing when time permits.
"No one who sees a peregrine falcon fly can ever forget the beauty and thrill of that flight."
  - Dr. Tom Cade

td

Quote from: kdmoyers on Today at 09:11:30 AMThanks Tony!!  We're all new at this stuff.

That's the fun part. An opportunity to experience the awe and aha moments of learning.
"No one who sees a peregrine falcon fly can ever forget the beauty and thrill of that flight."
  - Dr. Tom Cade
