Author Topic: Procedural resource dump  (Read 139001 times)
[email protected]
Guest
« Reply #520 on: March 10, 2023, 09:08:37 AM »

Retraining one model on previously trained models is a common tactic in ML, if I understand correctly. I have a friend who had to train ML detectors for interpreting acoustic data and used models that were previously trained on visual input.
Is that the same as layering new data over an existing model? A lot of stuff I see in the LLM world is trained on the same basic dataset, and when they make a new one they just train it from scratch. The ideal would be to retain the structure while bleaching out the little cells, to use an analogy that I'm probably conveying very poorly.
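(That acoustic-over-visual trick is essentially transfer learning: keep the pretrained weights and retrain only a small head on the new data, which is pretty much the "retain the structure" idea. A minimal sketch, assuming PyTorch/torchvision; the 10-class spectrogram setup is a made-up example, not anyone's actual pipeline.)

Code:
import torch.nn as nn
from torchvision import models

# start from a network pretrained on images: the learned "structure" is kept
backbone = models.resnet18(weights="IMAGENET1K_V1")

# freeze the pretrained weights so they act as a fixed feature extractor
for p in backbone.parameters():
    p.requires_grad = False

# swap the final layer for the new task, e.g. 10 acoustic classes fed in
# as spectrogram images; only this new head gets trained
backbone.fc = nn.Linear(backbone.fc.in_features, 10)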

gimymblert
Level 10
The archivest master, leader of all documents
« Reply #521 on: March 10, 2023, 02:05:03 PM »

Quote
Retraining one model on previously trained models is a common tactic in ML, if I understand correctly. I have a friend who had to train ML detectors for interpreting acoustic data and used models that were previously trained on visual input.
Is that the same as layering new data over an existing model? A lot of stuff I see in the LLM world is trained on the same basic dataset, and when they make a new one they just train it from scratch. The ideal would be to retain the structure while bleaching out the little cells, to use an analogy that I'm probably conveying very poorly.

That's not how it's done:
- there is a foundation model trained on everything
- then they FINE-TUNE it with a smaller data set around a task, NOT FROM SCRATCH
They basically do what you just proposed.

Recent innovations have found faster ways to train on smaller data sets, assuming you still start from the foundation model. These are called CONTROL NETS (a kind of hypernetwork; watch the videos about DreamBooth and LoRA I shared previously). Basically you have a side network that manipulates the weights of the foundation model, so you don't have to fine-tune it, which would take longer because it means changing all the weights. Instead of changing the weights, you use an add-on that steers them from outside, and that side network is smaller and learns faster.
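To make the side-network idea concrete, here is a minimal LoRA-style sketch, assuming PyTorch; the class name and hyperparameters are made up for illustration rather than taken from any particular repo. The foundation layer stays frozen and only a tiny low-rank correction is trained:

Code:
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wrap a frozen linear layer and learn a small low-rank update beside it."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False              # foundation weights stay frozen
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))  # starts as a no-op
        self.scale = alpha / rank

    def forward(self, x):
        # frozen path + trainable low-rank correction, steered "from outside"
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale

Fine-tuning then only touches A and B, which is why it fits on small GPUs and trains quickly.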

I hope I repeated the concepts enough in this short description to make sure the information got through Tongue

gimymblert
Level 10
The archivest master, leader of all documents
« Reply #522 on: March 10, 2023, 06:10:50 PM »

Here are my hot takes on AI and LLMs (snipped from a comment), please don't sue me!

Quote
Consciousness is being self-aware of our own existence. Consciousness supposes that an entity can simulate the world with itself as a distinct entity in that simulation, and can reflect on the goals and the actions of the entity identified as self; that is metaphysics. There's a working definition of consciousness.

Quote
It's not vague.
- Self-awareness is already applied in neural training: it's the process in which the architecture doesn't just predict the next state of the task, but its own state too.
- Simulation is the ability to predict the next state of the environment. By definition LLMs are simulators: they need to predict the next word, and in order to do so they need to predict the next context, thus understanding the prior context. Since language is a way to describe world states, this is equivalent to predicting a plausible world state. To predict the next word the model must simulate the world and its next plausible state.
- LLMs, trained for question answering or initialized as a persona, show self-awareness structure; that is, they are able to answer as if the instance entity is embedded in the world simulated in the conversation. I wouldn't call it consciousness yet, but a proto-structure that will help support the development of consciousness. The key element implied and still missing is structure that supports volition. There is proto-volition structure: when prompted, the LLM can express self-awareness, make judgement calls, and run chain-of-thought processes to implement actions, but it isn't acting on its own due to the lack of said volition; the external prompt acts as a substitute.
- That leads us to metaphysics, the ability of an entity to question its goals and purpose, that is, to redefine itself. Without volition, the ability for reflection is merely static and reactive. Reflection is the ability to apply judgement and thought processes to the instance in the simulation identified as self, and therefore potentially redefine goals and purpose as applications of those thoughts and judgements, which requires volition in the first place.

Everything is well defined and within the confines of current knowledge, without invoking mystical energy of the mind.

gimymblert
Level 10
The archivest master, leader of all documents
« Reply #523 on: March 10, 2023, 06:34:35 PM »

(video) Brain Criticality - Optimizing Neural Computations

gimymblert
Level 10
The archivest master, leader of all documents
« Reply #524 on: March 11, 2023, 09:17:20 AM »

I would like to add an amendment to what I said about self-awareness.
There are two ways to see self-awareness: as expression and as process. When you see a language model "express" self-aware-like dialogue (like in Neuro-sama), that's not necessarily "self-awareness"; we shouldn't anthropomorphize the expression of AI. Self-awareness is a pretty dry and technical term in the way I use it.

(video) I believe chatbots understand part of what they say. Let me explain.

NOT SELF-AWARE (not a fully reflective process, just a reactive one):

(video) The Times An Artificial Intelligence VTuber Became Self-Aware

More like a "self-aware-like" expression.

gimymblert
Level 10
The archivest master, leader of all documents
« Reply #525 on: March 11, 2023, 09:26:22 AM »

(video) How I Programmed My Own AI Girlfriend

(video) So I turned my VOICE into an anime WAIFU using ChatGPT and AI...

gimymblert
Level 10
The archivest master, leader of all documents
« Reply #526 on: March 11, 2023, 09:54:10 AM »

(video) How to make an AI VTuber Using GPT 3 and Google Cloud TTS

[email protected]
Guest
« Reply #527 on: March 11, 2023, 10:25:41 AM »

LLMs suck


gimymblert
Level 10
The archivest master, leader of all documents
« Reply #528 on: March 11, 2023, 05:40:30 PM »

(video) Practical Optimizations

gimymblert
Level 10
The archivest master, leader of all documents
« Reply #529 on: March 16, 2023, 03:18:39 PM »

(video) The Model That Changes Everything: Alpaca Breakthrough (ft. Apple's LLM, BritGPT, Ernie and AlexaTM)

TL;DR:
Some chads at Stanford took a small model (LLaMA 7B) and fine-tuned it on the outputs of GPT-3 (text-davinci-003) for about $600, getting a model close to ChatGPT in quality but orders of magnitude cheaper to run.

Not only can a smaller model have the same abilities as a bigger model, you can literally distill a big model's learning into a smaller one.
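The recipe is basically supervised fine-tuning of the small model on text the big model wrote. A minimal sketch, assuming a Hugging Face style causal LM; query_teacher and the prompt format are made-up stand-ins, not the actual Alpaca code:

Code:
def build_distillation_set(instructions, query_teacher):
    # query_teacher(prompt) -> str is assumed to call the big model (e.g. text-davinci-003)
    return [{"instruction": p, "response": query_teacher(p)} for p in instructions]

def fine_tune(student, tokenizer, dataset, optimizer):
    # plain next-token cross-entropy on "instruction + teacher response"
    for ex in dataset:
        text = f"### Instruction:\n{ex['instruction']}\n### Response:\n{ex['response']}"
        ids = tokenizer(text, return_tensors="pt").input_ids
        loss = student(input_ids=ids, labels=ids).loss   # causal-LM loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()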
« Last Edit: March 16, 2023, 03:25:21 PM by gimymblert »

gimymblert
Level 10
The archivest master, leader of all documents
« Reply #530 on: March 16, 2023, 05:49:24 PM »

https://www.forbes.com/sites/startswithabang/2015/11/15/averaging-inanimate-objects-can-produce-human-faces/
Averaging Inanimate Objects Can Produce Human Faces

If you understand, you know...

(video) Blending of non-human, inanimate objects to produce an "average" human face


[email protected]
Guest
« Reply #531 on: March 17, 2023, 04:06:15 AM »

Here's that Alpaca page if you don't want to go to YouTube:

https://crfm.stanford.edu/2023/03/13/alpaca.html

Really annoyed by AI licensing and gatekeeping. Just give us a free model and throw it up on a site so we can spin up chatbots. I'm seeing a few copies being uploaded to Hugging Face, but they didn't set up the API so it's hard to evaluate. I wonder how big the file would be for a functional Alpaca once it's trained up.

gimymblert
Level 10
The archivest master, leader of all documents
« Reply #532 on: March 18, 2023, 05:06:08 AM »

There is a GitHub repo for Alpaca:
https://github.com/tatsu-lab/stanford_alpaca
https://github.com/tloen/alpaca-lora

Quote
This repository contains code for reproducing the Stanford Alpaca results using low-rank adaptation (LoRA). We provide an Instruct model of similar quality to text-davinci-003 that can run on a Raspberry Pi (for research), and the code can be easily extended to the 13b, 30b, and 65b models.

In addition to the training code, which runs within five hours on a single RTX 4090, we publish a script for downloading and inference on the foundation model and LoRA, as well as the resulting LoRA weights themselves. To fine-tune cheaply and efficiently, we use Hugging Face's PEFT as well as Tim Dettmers' bitsandbytes.

Without hyperparameter tuning or validation-based checkpointing, the LoRA model produces outputs comparable to the Stanford Alpaca model. (Please see the outputs included below.) Further tuning might be able to achieve better performance; I invite interested users to give it a try and report their results.
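For reference, the LoRA setup in that repo boils down to roughly the following. This is a sketch assuming the Hugging Face transformers + peft + bitsandbytes stack; exact argument names and the model id vary between versions, so treat it as an outline rather than the repo's actual training script:

Code:
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "decapoda-research/llama-7b-hf"   # the LLaMA conversion commonly used at the time
model = AutoModelForCausalLM.from_pretrained(base, load_in_8bit=True, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(base)

# attach small trainable LoRA matrices to the attention projections;
# the frozen 7B weights themselves are never updated
lora = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"],
                  lora_dropout=0.05, bias="none", task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()       # only a few million parameters train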
« Last Edit: March 18, 2023, 05:18:07 AM by gimymblert »

[email protected]
Guest
« Reply #533 on: March 18, 2023, 10:46:25 AM »

That alpaca-lora sounds like a 7B-parameter model. Somebody uploaded one here, but it looks like it just has the overlay weights (the LoRA adapter), not a full trained model. Google Colab is asking for some kind of sign-in and I don't understand the UI. Is there a chatbot instance running somewhere with the real deal, production-ready and good to go? Because I haven't seen it yet.

gimymblert
Level 10
The archivest master, leader of all documents
« Reply #534 on: March 19, 2023, 03:54:33 AM »

Quote
Relevant: Since LLaMA leaked on torrent, it has been converted to Hugging Face weights and quantized to 8-bit for lower VRAM requirements.

A few days ago it was also quantized to 4-bit, and 3-bit is coming. The quantization method they use is from the GPTQ paper (https://arxiv.org/abs/2210.17323), which leads to almost no quality degradation compared to the 16-bit weights.

4-bit weights:

Model      | weight size | VRAM req.
LLaMA-7B   | 3.5 GB      | 6 GB
LLaMA-13B  | 6.5 GB      | 10 GB
LLaMA-30B  | 15.8 GB     | 20 GB
LLaMA-65B  | 31.2 GB     | 40 GB

Here is a good overall guide for Linux and Windows:

https://rentry.org/llama-tard-v2#bonus-4-4bit-llama-basic-se...

I also wrote a guide how to get the bitsandbytes library working on windows:

https://github.com/oobabooga/text-generation-webui/issues/14...
https://news.ycombinator.com/item?id=35107058
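Those file sizes track the simple estimate weight size ≈ parameters × bits / 8. A quick back-of-the-envelope check (the small mismatches are overhead and layers left unquantized):

Code:
def weight_size_gb(params_billion, bits):
    return params_billion * 1e9 * bits / 8 / 1e9   # bytes -> GB

for p in (7, 13, 30, 65):
    print(f"LLaMA-{p}B @ 4-bit ~= {weight_size_gb(p, 4):.1f} GB")
# ~3.5, 6.5, 15.0, 32.5 GB vs the quoted 3.5, 6.5, 15.8, 31.2 GB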

[email protected]
Guest
« Reply #535 on: March 19, 2023, 05:13:06 AM »

So, LLaMA is available under a non-commercial license and it works like GPT. Okay. It's cool that you can shrink the weights anyway.

gimymblert
Level 10
The archivest master, leader of all documents
« Reply #536 on: March 19, 2023, 07:42:02 AM »

(video) What’s Your Brain’s Role in Creating Space & Time?

[email protected]
Guest
« Reply #537 on: March 19, 2023, 08:50:31 AM »

I'm so fucking tired of metaphysics. This thread is supposed to be for things of immediate practical use for game developers, not spiritual unlocking to ascend above the matrix, unless I can metaphysically manifest increased revenue, but I have too many metaphysical blockers to conjure up money. I'd rather just make the damn game and claim my reward in 2024, 2025, or whenever it's over....

Back on topic, I am currently using an adapted version of Lobster's wave function collapse as the sole method of procedural generation in RSOD. Its execution time is not very deterministic, so I have to run it in advance of shipping builds, but anyways here's the code:

https://pastebin.com/nzVkfhCf
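For anyone who doesn't want to click through, the loop has the usual WFC shape. This is a simplified sketch rather than the pastebin version: the three-tile adjacency table is a made-up example, and a real implementation needs retries or backtracking when it hits a contradiction (which is also why the run time isn't deterministic):

Code:
import random

TILES = ["grass", "sand", "water"]
# ALLOWED[a] = set of tiles that may sit next to tile a (kept symmetric here)
ALLOWED = {
    "grass": {"grass", "sand"},
    "sand":  {"grass", "sand", "water"},
    "water": {"sand", "water"},
}

def wfc(width, height, rng=random):
    # every cell starts as a superposition of all tiles
    grid = [[set(TILES) for _ in range(width)] for _ in range(height)]

    def neighbors(x, y):
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            if 0 <= nx < width and 0 <= ny < height:
                yield nx, ny

    def propagate(x, y):
        # shrink neighbouring cells' options until nothing changes
        stack = [(x, y)]
        while stack:
            cx, cy = stack.pop()
            allowed_next = set().union(*(ALLOWED[t] for t in grid[cy][cx]))
            for nx, ny in neighbors(cx, cy):
                new_opts = grid[ny][nx] & allowed_next
                if not new_opts:
                    raise RuntimeError("contradiction - retry with another seed")
                if new_opts != grid[ny][nx]:
                    grid[ny][nx] = new_opts
                    stack.append((nx, ny))

    while True:
        # pick the undecided cell with the fewest remaining options (lowest entropy)
        open_cells = [(x, y) for y in range(height) for x in range(width)
                      if len(grid[y][x]) > 1]
        if not open_cells:
            # fully collapsed: unwrap each single-option set into a tile name
            return [[next(iter(cell)) for cell in row] for row in grid]
        x, y = min(open_cells, key=lambda p: len(grid[p[1]][p[0]]))
        grid[y][x] = {rng.choice(sorted(grid[y][x]))}   # collapse this cell
        propagate(x, y)

# usage: tiles = wfc(16, 16)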

gimymblert
Level 10
The archivest master, leader of all documents
« Reply #538 on: March 19, 2023, 10:35:21 AM »

You should watch the entire video first.

[email protected]
Guest
« Reply #539 on: March 19, 2023, 10:51:55 AM »

PBS? Nope, not watching. Show me the code.