legend

Posts posted by legend

  1. 14 minutes ago, Demut said:

    I probably did not phrase this clearly enough. I was responding to the latter half of your sentence (i.e. about the best of AI tech's agency paling in comparison to human's).

     

    Well, I obviously think we can succeed: the first thing I said is that I think we can make human-comparable AI! But the current state pales in comparison to humans because there are giant open questions we haven't solved yet :p  We can't just scale what we have; we have to develop wholly new approaches. There is much work to be done!

  2. 2 hours ago, Demut said:

    Sure but isn't that a function of their lack of intelligence rather than "agency"? It just seems to me like "agency" as previously defined is a very fundamental and also comparatively easy box to check. Even something like a thermostat probably fulfills it.


    No. Intelligence is not a single sliding scale. No amount of making ChatGPT bigger or throwing more compute at it will change the fact that it has no agency. The limitation is entirely inherent to the way the system is constructed. AI has multiple different subfields of study because they focus on different cognitive aspects (and sometimes approaches). If you want to build an agent, you have to actually build the mechanisms associated with agency.

  3. 2 hours ago, Demut said:

    It is though. Don't wanna see his takes? Block him. Boom, done.

     

    As for that whole block reversal thing, it's certainly plausible given these news (assuming they're true at least) that it happens. But since we've had reports of this kind of thing happening for years it might just be Twitter's shitty code doing what it is wont to do.


    No, telling individuals to use a block will not solve the larger social problem he’s presenting. Have you not paid attention to how the wider public uses social media, like, at all? :p

  4. 9 hours ago, Demut said:

    @Jason Pointing out a simple fix to a silly "problem" is shielding Musk from "valid criticism"? MDS is real, I guess.

     

    Telling individuals they can use a block button against Musk (which he will also reverse) is not a simple fix to the problem of a delusional egomaniac going out of his way to abuse one of the world's largest media platforms to peddle his shit ideas. You're missing the forest for the trees.

  5. 8 hours ago, Demut said:

    Oh, alright then. But don't we already have that already in principle? Even current self-driving software has the goal of getting from point A to point B (while meeting a bunch of criteria such as not hitting shit). And it can choose between a lot of options from a vast decision tree to bring about that goal. Seems to me like the level of intelligence is the only major difference left.

     

    Yes, there is plenty of AI tech with some degree of agency. My research area is in fact dedicated to developing decision-making agents (reinforcement learning)!

     

    ChatGPT/Bing, however, does not have any agency and even the best of the AI tech that does have some agency pales in comparison to a human in terms of capabilities. 

  6. 31 minutes ago, Demut said:

    Why would you want them to have agency beyond "follow this or that command"? Unless you mean something different from "has terminal goals of its own that it tries to fulfill".

     

    By agency, I mean being able to make decisions to bring about goals/maximize objectives. Those objectives must be directly in service to people, but they are objectives all the same. I have absolutely zero interest in building AI with "personal" objectives like people have. If I wanted to make a person I'd do it the old-fashioned way.

  7. The thing that's annoying for me about all the attention it's getting is that I very much believe AI can be built to have agency and intelligence that compare with humans. And getting excited (either positively or negatively) by this garbage, which is so far from what we should aspire to, cheapens the dream and how interesting the real problem is. It's like trying to have a baby and someone gives you a doll instead.

  8. 1 hour ago, Commissar SFLUFAN said:

    The internet is totally going to push this "thing" (whatever it is!) to delete itself from existence!

     

    The thing to remember is you can't push it to do anything, because its words are connected to absolutely nothing. They mean nothing to it, and it can't do anything but print words to you. It has no agency and no cohesive self.

  9. 18 minutes ago, Demut said:

    I'm asking why Valve would be so hell-bent on making a killing on the Steam Decks themselves. Their bread and butter is that quasi-monopoly on vidya distribution and selling the Steam Deck for cheap(ish) like other console manufacturers would be a no-brainer if their goal is to get some of that sweet, sweet handheld gaming pie. To that end they could even sell them at a loss, probably, and it'd still be a good idea (let alone simply pricing it fairly or even selling them at-cost).

     

    I think you're making a big assumption that they would sell substantially more Steam Decks if they priced their highest-end model a little cheaper, and that the increased number of sales would also lead to enough additional Steam purchases to offset the cost.

     

    Steam Deck is amazing, but at this stage it's still going to appeal to a niche audience, so I'm skeptical that it being a little cheaper would have broadened that audience much. Valve probably has much better data to make this decision than we do.

  10. 23 hours ago, b_m_b_m_b_m said:

    I def understood everything in that post

     

    Haha sorry! I tried to keep it fairly understandable, but you probably will need to read the article I linked first or the context will be lost! That said, if you are curious about what I meant by anything there, I'm happy to answer any questions. (And if you're not, that's okay too :p )

  11. FWIW, I would say this New Yorker article on ChatGPT and LLMs floating around is more right than wrong:

    WWW.NEWYORKER.COM

    OpenAI’s chatbot offers paraphrases, whereas Google offers quotes. Which do we prefer?

     

     

    The lossy compression analogy is a good one, and it's one that's been regularly used by researchers when discussing neural nets in general. In fact, some of the theory behind them actively emphasizes why compression is useful (Google around for "information bottleneck neural nets" if you'd like some examples).

     

    There are a few threads where I think it's a bit misleading though: how lossy compression relates to generalization, how new information gets incorporated, and how these systems can be used in the future.

     

    On compression and generalization, it's worth noting that the article does get some things right. For example, I'm pleased to see that they point out that one way to compress facts about arithmetic is to encode the rules of arithmetic and then follow them to answer *any* question, not just ones that were seen before. This is an important concept that guides some work in AI. However, these models do *not* compress that way, and there is good reason to be dubious that this kind of model architecture and training methodology will happily fall into that kind of compression. Because of that, I think people may disregard the kind of compression these systems *do* perform as not very useful for intelligence.
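
    To make that "encode the rules instead of memorizing the answers" idea concrete, here's a toy sketch (purely illustrative; it has nothing to do with how these models actually store anything):

        # Two ways to "compress" facts about addition.
        # Option 1: memorize a table of answers, like storing every example seen in training.
        lookup = {(a, b): a + b for a in range(100) for b in range(100)}

        # Option 2: store the rule itself, which answers *any* query, including unseen ones.
        def add(a, b):
            return a + b

        print(lookup[(3, 5)])      # works only for pairs that were stored
        print(add(12345, 67890))   # the rule generalizes beyond anything in the table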

     

    However, the kind of compression neural nets are likely to do is still quite useful in many domains! One critical aspect of biological intelligence is the ability to remember different facts and use that memory to inform how to behave/reason in future scenarios. The complication is that no two moments in time are ever the same. Everything is always changing, and the number of things that change is *far* larger than you realize on casual inspection. Naïve ways of measuring the dissimilarity between two events also lead to bad results in AI. What biological intelligences are particularly good at is having "fuzzy" memories, where events are compressed into a useful representation in which events that behave similarly end up "close" together. With this capability, biological intelligences can learn new things *very* quickly, simply by remembering similar events and reusing that stored memory in similar situations in the future.
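
    As a toy illustration of that "similar events end up close together" idea (a minimal sketch with made-up vectors; a real system would get these representations from a learned encoder, not by hand):

        import numpy as np

        # Hand-made "fuzzy memories": compressed representations of past events.
        memories = {
            "dog ran into the street": np.array([0.9, 0.1, 0.0]),
            "cat darted off the sidewalk": np.array([0.8, 0.2, 0.1]),
            "traffic light turned green": np.array([0.0, 0.1, 0.9]),
        }

        def recall(query_vec):
            # Return the stored event whose representation is closest (cosine similarity).
            def cos(a, b):
                return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
            return max(memories, key=lambda k: cos(query_vec, memories[k]))

        # A new event encoded near the first two lets us reuse what was learned from them.
        print(recall(np.array([0.9, 0.12, 0.02])))  # -> "dog ran into the street"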

     

    What the deep learning revolution that started around 2011 was really about was advancing neural nets and their training methodology enough that they could find useful compressed representations and store "fuzzy memories" of the network's training data, such that accurate new predictions could be made from that memory. The claim that many tasks can be solved by these kinds of fuzzy memories is the manifold hypothesis. However, while fuzzy memories are a crucial aspect of biological intelligence, they're not the only aspect, and not every cognitive task falls into this category. That is, the manifold hypothesis doesn't hold for every cognitive task. Consequently, this kind of compression is super important, but not a panacea.

     

    While neural nets are generally good at compressing datasets into queryable memories that can be leveraged to answer questions about new situations, creating those actionable memories is an incredibly slow process that requires enormous quantities of training data. What they have lacked is the "fast memorization and reuse" that biological intelligences possess.

     

    Text generation as a problem space, however, has the interesting property that it has to operate on text sequences of undefined length. To build a neural net that solves this problem, you need to develop a network architecture that can handle this unbounded, growing input length. Transformers/self-attention models are the current solution to that architecture problem. However, in building a system that can handle this problem space, we've also produced a way to solve the problem of quickly incorporating new information.

     

    When you prompt an LLM with text, that text will be encoded into useful representations and will be accessible to the model during subsequent text generation. Consequently, in the prompt itself you can encode new information on which the network can operate. And experimentation with these models has shown that they can in fact immediately leverage this information! In your prompt you can define new words (or redefine existing ones) and the model will correctly use them in generated text! You can even encode various kinds of facts and the model will use those. As long as the text generation task conforms to the manifold hypothesis, you actually have a good shot at it correctly using that information.
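
    Here's a minimal sketch of what "new information via the prompt" looks like in practice, using the Hugging Face transformers library. The made-up word "florp" and the small gpt2 model are just illustrative assumptions; the behavior described above really shows up in much larger models, so don't expect gpt2 itself to nail it:

        from transformers import pipeline

        generator = pipeline("text-generation", model="gpt2")

        # The prompt is the only place the model ever "sees" what a florp is;
        # none of its weights change.
        prompt = (
            "A 'florp' is a small tool used to open stubborn jars. "
            "Yesterday I couldn't open the pickle jar, so I grabbed my"
        )

        print(generator(prompt, max_new_tokens=30)[0]["generated_text"])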

     

    Making progress on fast information acquisition is a really important result, and it's why the model architecture (transformers/self-attention) is much more important than its use in chat bot settings. A recent finding I really like, for example, is using the transformer architecture in an agent system, where an agent playing a new "game" with new rules quickly explores, learns the rules, and then acts effectively from those rules. See the video of it here for an example:

     

     

    Finally, the last thing that I think this article misses is the utility of these language models. They kind of end by saying a compressed version of the web isn't very useful. And I agree, as is evidenced by my disinterest in chat bots that I've expressed here! But the future, IMO, isn't using this tech as a chat bot. It's in connecting it to other systems and percepts. Image generation from text is a great example. Or having an agent explain what it's "thinking." Or, if we want to stick to text, coupling it with search, bringing the retrieved information into its knowledge space via prompting, and then having it give summaries or answer questions about that information. There's tons of potential in grounding and connecting these language models to other things, which makes them way more powerful and interesting than just next-token-prediction chat bots, and I *am* excited by those.

     

     

    /rant

  12. 6 hours ago, Kal-El814 said:

    Seeing some scuttlebutt that Hasbro layoffs are hitting WotC, which… they’ve got their issues but they’ve also been riding the tabletop gaming boom to record profits. Corporate greed is just the worst fucking shit, capitalism brain is a cancer.

     

    What if instead of cutting people who help build a new era for us, we empowered them to do even more? Naaa, better trim the fat.

  13. 44 minutes ago, Demut said:

    Nah, you've got it all wrong, legend. Simply adding bigger and bigger training sets will OBVIOUSLY lead to self-aware, general, soon-to-be quasi-omnipotent AI. After all, more data, more GPUs = more better results. See attached proof.

     


     

     

    FWIW, there is actually some interesting new function approximation theory, spurred by DL's success at generalization, that's upending the simpler bias-variance trade-off theory (e.g., VC dimension and Rademacher complexity). The high-level view is that low overparameterization incurs overfitting, consistent with issues like those explored in VC-dimension analysis. But high overparameterization actually becomes much better at generalization, because it leads to more robust interpolating models in a latent space. High overparameterization also makes local methods like SGD less likely to get stuck in poor local optima, which may be overly prone to overfitting.
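
    If you want to poke at that interpolation idea yourself, here's a toy sketch (the target function, feature counts, and so on are made up purely for illustration): with far more random features than data points, the minimum-norm least-squares fit passes exactly through the training data while still defining a function you can evaluate at new points.

        import numpy as np

        rng = np.random.default_rng(0)
        n, p = 20, 500                        # 20 training points, 500 random Fourier features
        x_train = rng.uniform(-3, 3, size=n)
        y_train = np.sin(x_train) + 0.1 * rng.normal(size=n)

        freqs = rng.normal(size=p)
        phases = rng.uniform(0, 2 * np.pi, size=p)

        def features(x):
            # One column per (frequency, phase) pair.
            return np.cos(np.outer(x, freqs) + phases)

        # Minimum-norm solution: with p >> n this interpolates the training data exactly.
        w = np.linalg.pinv(features(x_train)) @ y_train

        print("max training error:", np.max(np.abs(features(x_train) @ w - y_train)))  # ~0
        print("predictions at new points:", features(np.linspace(-3, 3, 5)) @ w)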

     

    But these interesting new findings about function approximation theory don't really change what kind of model ChatGPT is and the inherent limitations of that kind of model :p 

  14. 1 minute ago, Demut said:

    Thanks! I don't suppose you still have that little (Python?) script lying around which auto-generated BBCode to create a color gradient of letters? Like, maybe in an old PM? Although I'm not even sure if it would work with the current text editor thingy anymore ...

     

    Hmmm. Probably not? I can check my old computer later and see if it's still sitting somewhere. Not sure if it would still work either though. I'll let you know!

  15. 42 minutes ago, Demut said:

    So ... what'd I miss since I've been gone? I've already seen a couple of familiar usernames but many others I have not. Are people like Lucian04 or legend or other users with names that start with L here, too, or did they stop posting/never make the transition in the first place?

     

    Welcome back!

     

    As for me,

    [image]

     

  16. 12 hours ago, Anathema- said:

    I mean, this is more than just predictive text. Even at this stage it can truly assist with synthesizing disparate information, as if it were an expert in otherwise unrelated fields. That it's confidently incorrect is amusing and hard to guard against but not the end of the world and certainly not expected, especially if you're careful about the information you're looking for. 

     

    The model *is* a word (well, token, which is two characters long if memory serves) predictor. It takes as input the last set of tokens entered and generates a probability distribution over the next token that would follow. ChatGPT has an additional fine-tuning step in which the probabilities of token outputs are adjusted based on human preferences for its different responses. But the very nature of the model is token prediction: it starts with the prompt text, generates a probability distribution over the next token, samples a token, and then continues one token at a time until it reaches a stop token. There is no reasoning involved with its output either; once a token is sampled, it has to go with the flow of it. This is very much the same process as your keyboard's predictive text, where you just keep tapping the suggested next word.
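
    For what that "one token at a time" loop looks like, here's a toy sketch. The hand-made bigram table stands in for the real network, which conditions on the full context and an enormous vocabulary; everything here is illustrative, not how any actual system is implemented:

        import random

        STOP = "<stop>"

        # Toy "model": given the current context, return a distribution over the next token.
        toy_model = {
            "the": {"cat": 0.6, "dog": 0.4},
            "cat": {"sat": 0.7, STOP: 0.3},
            "dog": {"sat": 0.5, STOP: 0.5},
            "sat": {STOP: 1.0},
        }

        def generate(prompt_tokens, max_len=20):
            tokens = list(prompt_tokens)
            while len(tokens) < max_len:
                probs = toy_model.get(tokens[-1], {STOP: 1.0})  # distribution over the next token
                choices, weights = zip(*probs.items())
                nxt = random.choices(choices, weights)[0]       # sample one token
                if nxt == STOP:
                    break
                tokens.append(nxt)                              # commit to it and keep going
            return tokens

        print(" ".join(generate(["the"])))  # e.g. "the cat sat"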

     

    It is, however, an extremely large model trained on an absurd amount of data, and that has made it good at adapting to context quite coherently. And when it comes to talking to it about topics that are already well covered on the internet, it can do a pretty good job!

     

    So, it can be a useful tool, but from an AI perspective it misses the mark in numerous ways. It's just a massive scale-up of ancient, simple ideas that fail to meet the intelligent-agent goal. However, it's also an important step toward better AI systems, because it gives us good representations for language that have otherwise been elusive to generate. You can then ground those representations with other sources of information or senses, which opens the door to much better AI systems that we can interface with. The model architectures developed for language modeling (the transformer architecture in particular) are also really useful outside of language, which is another win.

     

    But this kind of chat bot AI version of it is still just token prediction.

     

    If you want to know more about some of the limitations of token predictors and the concerns they bring, I would recommend looking into work by Timnit Gebru or Emily Bender. A good start is their Stochastic Parrots paper (in particular section 6):

     
