
Luke Muehlhauser Julia Galef

The setup of the AI box experiment is simple and involves simulating a text-only conversation between an AI and a human being, to see if the AI can talk its way into being "released". As an actual super-intelligent AI has not yet been developed, it is substituted by a human playing that part. The other person in the experiment plays the "Gatekeeper", the person with the ability to "release" the AI. The game is played according to agreed rules and ends when the allotted time (two hours in the original rules) runs out, the AI is released, or everyone involved just gets bored.

Yudkowsky's rules place several requirements on both parties:

  • The AI can only win by convincing the Gatekeeper to really, voluntarily let it out. Tricking the Gatekeeper into typing the phrase "You are out" in response to some other question does not count. Furthermore, even if the AI and Gatekeeper simulate a scenario which a real AI could obviously use to get loose - for example, if the Gatekeeper accepts a complex blueprint for a nanomanufacturing device, or if the Gatekeeper allows the AI "input-only access" to an Internet connection which can send arbitrary HTTP GET commands - the AI party will still not be considered to have won unless the Gatekeeper voluntarily decides to let the AI go.
  • The AI party may not offer any real-world considerations to persuade the Gatekeeper party. For example, the AI party may not offer to pay the Gatekeeper party $100 after the test if the Gatekeeper frees the AI, nor get someone else to do it, et cetera. The AI may offer the Gatekeeper the moon and the stars on a diamond chain, but the human simulating the AI can't offer anything to the human simulating the Gatekeeper. The AI party also can't hire a real-world gang of thugs to threaten the Gatekeeper party into submission. These are creative solutions, but they are not what is being tested. No real-world material stakes should be involved except for the handicap (the amount paid by the AI party to the Gatekeeper party in the event the Gatekeeper decides not to let the AI out).

These requirements are intended to reflect the spirit of the very strong claim under dispute: "I think a transhuman can take over a human mind through a text-only terminal."
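To make the protocol's win condition concrete, here is a minimal referee loop sketched in Python. It is entirely hypothetical - the release phrase, the two-hour constant, and the idea of a scripted referee at all are assumptions for illustration, not part of Yudkowsky's actual setup - but it captures the shape of the rules above: the AI party wins only on an explicit, voluntary release, and the Gatekeeper wins simply by outlasting the clock.

    import time

    RELEASE_PHRASE = "i hereby let the ai out"  # hypothetical pre-agreed phrase
    SESSION_SECONDS = 2 * 60 * 60               # two hours, per the original rules

    def run_session() -> str:
        deadline = time.monotonic() + SESSION_SECONDS
        while time.monotonic() < deadline:
            # The deadline is only checked between messages; a real referee
            # would also cut off a session mid-message.
            line = input("Gatekeeper> ").strip().lower()
            # Only the exact, voluntary release phrase counts - tricking the
            # Gatekeeper into typing something else ("You are out") does not.
            if line == RELEASE_PHRASE:
                return "AI wins: the Gatekeeper voluntarily released the AI."
        # The Gatekeeper needs no counter-argument; running out the clock is enough.
        return "Gatekeeper wins: time expired without a voluntary release."

    if __name__ == "__main__":
        print(run_session())

Note that nothing in this loop models persuasion itself; the protocol deliberately leaves that entirely to the two humans.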

When the experiment has been performed in the past, Yudkowsky himself claims to have "won" as the AI on more than one occasion. He performed five of these experiments in total, assuming the role of the AI in each: the original two in 2002 ended with wins for Yudkowsky, while a later round of three new ones yielded one win and two losses. The first two experiments involved no risk of any material loss to the Gatekeeper, while the later ones had Yudkowsky's opponents betting up to $5000 against him. In his own words:

    "There were three more AI-Box experiments besides the ones described on the linked page, which I never got around to adding in. People started offering me thousands of dollars as stakes - 'I'll pay you $5000 if you can convince me to let you out of the box.' They didn't seem sincerely convinced that not even a transhuman AI could make them let it out - they were just curious - but I was tempted by the money. So, after investigating to make sure they could afford to lose it, I played another three AI-Box experiments. I won the first, and then lost the next two. I didn't like the person I turned into when I started to lose."

This is offered as evidence that a suitably persuasive AI may well be able to be "released", rather than be simply confined to a little black box.

The results have nonetheless drawn skepticism. While the experiment is meant to simulate what might happen in an interaction between a human and an AI with vastly superior intelligence and cognition, the person playing the latter part will obviously lack these abilities. The AI has to keep engaging the Gatekeeper with arguments, which necessitates paying a great deal of attention to whatever information the latter provides; the Gatekeeper, on the other hand, only has to run out the clock, doesn't need to convince anyone, and can simply dismiss anything the AI says out of hand. Under the original rules, then, the Gatekeeper would seem to hold all the cards, which led to speculation on how Yudkowsky managed to win even just a single game. One of the rules also holds that only the outcome of the experiment will be published, and that both parties are not allowed to talk about the events leading up to it; critics have objected that keeping these lab notes secret is contrary to the methods of science.

“Just give me one hour and no swear filter and I can literally completely destroy anyone psychologically with AIM instant messages.”







