1/7/12

Comments from the knol

Well, that didn't work out. 3.5 years, ~20K views, & 69 comments later, I am back on the blogger.
This post contains old comments on the knol.
Below that is ancient history, in case you want to see how things progressed.

Derek Zahn:

Understanding another human being's thoughts is hard. :)


Hi Boris,

Sorry for the delay... I wrote a long time ago something to the effect that I like to try understanding the ideas of other researchers working on AGI-related theories (at least those that seem to have some hope of being interesting) and wanted to try and understand yours. I have returned to your pages once in a while but have great difficulty even starting to try and get a grip on what you are writing about. Part of the blame for that is the difficulty of the subject matter, part is that I'm just not very smart, but mostly (and frustratingly) it is simply very hard for human beings to communicate with each other -- when reading, we have to fill in so much from our own viewpoint and experience, and that is a very error-prone process. So, although I'm afraid that my questions will be stupid and nitpicky and possibly a waste of time to answer, they are the only way for me to figure out how to interpret what you are saying. On the plus side, maybe any clarifications you make for me would be useful for other readers as well.

Although general motivations and criticisms of other AI approaches can be fun, I'm going to ignore that stuff unless it becomes critical for my main purpose, which is understanding your theory in its current state.

One way to facilitate communication is to develop a concrete frame of reference as a starting point. So: although I imagine your theory is intended to be very general in nature (and thus applicable to a variety of agents and environments), it is helpful for me to pick a particular case, so that general points can be applied to this concrete situation... very general abstract theories are almost impossible to communicate from one person to another because there are so many possible interpretations of language; having a concrete situation as a reference will help me fill in some meaning.

So: Suppose I have a robot roaming around my neighborhood. It has one sense modality: a black-and-white video camera affixed to the front of the robot. At fixed intervals (say 30 times per second but the exact rate isn't important), a video frame gets digitized and handed to the "intelligence" program implementing your theory. Although it won't be needed for a while :) suppose that the robot has tank tracks for its drive and a signed output signal controls the speed of each side track.

Can we use this system as a concrete reference? Is it missing something needed for your theory to apply to it?

Assuming it's okay... from your description, I understand that you save a history of past input frames, indexed by their offset into the past. You also compute the derivative between two successive frames on a per-pixel basis using the numerical difference between pixel values at each point.

The goal is to predict the pixel values in the next input frame.
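
Derek's reading so far (a history of past frames indexed by offset into the past, a per-pixel difference between successive frames, and next-frame prediction) might be sketched in Python as follows. This is only an illustration of that reading, not Boris's algorithm: the class name FrameHistory, the 30-frame history length, and the linear extrapolation used as the "prediction" are all assumptions made for the sketch.

```python
from collections import deque


class FrameHistory:
    """Hypothetical store of past frames, indexed by offset into the past."""

    def __init__(self, max_frames=30):
        # frames[0] is the most recent frame, frames[1] the one before it, etc.
        self.frames = deque(maxlen=max_frames)

    def push(self, frame):
        # Copy each row so later changes to the caller's frame don't alter history.
        self.frames.appendleft([row[:] for row in frame])

    def derivative(self):
        """Per-pixel numerical difference between the two most recent frames."""
        if len(self.frames) < 2:
            return None
        new, old = self.frames[0], self.frames[1]
        return [[n - o for n, o in zip(new_row, old_row)]
                for new_row, old_row in zip(new, old)]

    def predict_next(self):
        """Assumed predictor: latest frame plus latest difference (no clamping)."""
        diff = self.derivative()
        if diff is None:
            return None
        new = self.frames[0]
        return [[n + d for n, d in zip(new_row, diff_row)]
                for new_row, diff_row in zip(new, diff)]
```

Pushing two small frames and calling predict_next() extrapolates each pixel by its last observed change; a real camera feed would also need values clamped to the valid pixel range.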

Ok, let me stop there to make sure we are on the same page. Comments? If you don't have time to mess with what is likely to be a bunch of incomprehension on my part, I understand.... in that case, just don't respond to my comment. :)

Take care,
Derek
http://supermodelling.net




Last edited Sep 20, 2011 4:34 PM


Sorry for being difficult, Derek!

The problem is, to be on the same page we have to be on the same level of generalization (= decontextualization). You’re asking for concrete examples. While it is (theoretically) possible to explain how my algorithm will act in simple cases, such examples will not impress you. You’ll need to understand why I think it can scale to complex cases, & that reasoning is necessarily *abstract*. But, for some mysterious reason, you do find my approach interesting, so I’ll try:

> video frame gets digitized and handed to the "intelligence" program implementing your theory.

Actually, my theory *includes* digitization as the first step of compression, which maximizes correspondence_per_cost: my overall fitness function. This is important because these steps form a pattern, which must be indefinitely projectable, for the “program” to scale in complexity of such algorithms.

> I understand that you save a history of past input frames, indexed by their offset into the past. You also compute the derivative between two successive frames on a per-pixel basis using the numerical difference between pixel values at each point.
> The goal is to predict the pixel values in the next input frame.

Given a non-random environment, every input *is* a prediction for subsequent inputs (no prediction is 100% certain anyway). These inputs have incremental dimensionality: from 0D pixels, to 1D patterns: sequences of matching pixels, to 2D patterns: sequences of matching 1D patterns, & then to 3D, TD, & discontinuously matching patterns.
This is an indefinitely extensible hierarchy, where older inputs (history) are selectively stored (patterns vs. noise) & searched on higher levels. Each higher-level pattern is a "prediction" for lower-level inputs. 2D frames have no special status in my approach.

Notice that I start by defining match. Then I define a pattern as a set of matching inputs, & derivatives are computed by comparing among individually selected (stronger than average) patterns within corresponding level of search. This is not an indiscriminate all-to-all indexing, that would be a transform. These derivatives then form vectors to project their patterns (further refining their predictive value), & to form their own patterns. All of that is selective (according to predictive values of each variable), otherwise you get a combinatorial explosion.
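
The first step of this hierarchy (0D pixels compared into 1D patterns of matching pixels, keeping only the stronger-than-average ones) can be sketched in Python. The thread never states the definition of match, so the sketch assumes it is the smaller of the two compared values and that the derivative is their signed difference; the function names and the use of a plain mean as the "average" are also assumptions, not taken from the knol.

```python
from statistics import mean


def compare_adjacent(pixels):
    """Compare each pixel with its predecessor; return (match, difference) pairs."""
    results = []
    for prev, curr in zip(pixels, pixels[1:]):
        match = min(prev, curr)  # assumed definition: the shared (smaller) magnitude
        diff = curr - prev       # signed difference, i.e. the 1D derivative
        results.append((match, diff))
    return results


def form_1d_patterns(pixels):
    """Group runs of consecutive above-average matches into 1D patterns."""
    results = compare_adjacent(pixels)
    if not results:
        return []
    average = mean(m for m, _ in results)
    patterns, current = [], []
    for i, (match, diff) in enumerate(results):
        if match > average:
            # i + 1 is the coordinate of the later pixel in the compared pair
            current.append((i + 1, match, diff))
        elif current:
            patterns.append(current)
            current = []
    if current:
        patterns.append(current)
    return patterns
```

Each returned pattern is a list of (coordinate, match, difference) triples; in the scheme described above, such patterns would in turn become the inputs compared on the next level.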

Posted by Boris Kazachenko, Sep 14, 2011 4:24 PM
Hi Boris,

You're right that we have to be on the same level of decontextualization; I was hoping to drag you down to my level :) because if we refer to concrete things (like that robot) there is less room for misunderstanding. If I generalize into abstractions I won't end up the same place as you because my abstractions aren't the same as yours... and the result is that I don't know what the words you use are supposed to mean.

I don't care about "impressiveness" on simple examples, just clarity.

I'll try to climb into the clouds, but it will probably take a while. :) So, a few questions to start with:

> correspondence_per_cost: my overall fitness function.

Correspondence of what? Measured how? What does "cost" mean and how is it measured?

> every input *is* a prediction for subsequent inputs (no prediction is 100% certain anyway).
> These inputs have incremental dimensionality: from 0D pixels, to 1D patterns: sequences
> of matching pixels, to 2D patterns: sequences of matching 1D patterns [...]

I certainly get that an input *can be used as* a prediction for subsequent inputs (by an entity whose goal is prediction, for example -- with a prediction algorithm), and for some inputs in some environments (like the robot example) there will be a correlation between in(t) and in(t+1). Other kinds of "inputs" (say... the value of an audio sensor in a Las Vegas casino sampled every 41 hours) may not have any discernible correlation at all. But I don't think it's right to say that an input *is* a prediction, which is a confusing conflation of terms.

I don't understand what you mean by "inputs have incremental dimensionality". Incremented by who? How did "1D patterns: sequences of matching pixels" become an "input"? You say "pixels" which implies a visual semantics for an input...

Maybe these questions illustrate the confusion I experience when I even begin to try and understand what it is you are talking about...

Thanks!

Derek
http://supermodelling.net

Posted by Derek Zahn, Sep 17, 2011 10:36 AM
> If I generalize into abstractions I won't end up the same place as you because my abstractions aren't the same as yours... and the result is that I don't know what the words you use are supposed to mean.

My meanings are the most basic (decontextualized) possible, you *will* end in the same place if you just let go of your context (scary, I know). We all work off the same innate “algorithm”. If our generalizations don’t agree, then either we’re on different levels, or the level is too low for both of us.

> I don’t care about "impressiveness" on simple examples, just clarity.

But there must be a reason for you to *work* on understanding me, rather than a bunch of other things.

> Correspondence of what? Measured how? What does "cost" mean and how is it measured?

See section 1: definition of match, then of incrementally derived projected match. Cost (memory + operations) is initially the same for a basic comparison, so you normalize for it by subtracting average match from the prior search cycle: # comparisons. I’ve tried to explain all this in the knol, let me know what part is unclear. Beyond the first cycle, the cost is multiplied by additional # & power of comparisons, represented in the resulting patterns.

> I certainly get that an input *can be used as* a prediction for subsequent inputs…

It’s more basic than that, *any* prediction must be derived from past inputs. But these inputs have varying “predictive value”, both overall & for specific sources: lower-level locations. Patterns are inputs for higher levels, each representing multiple matching lower-level inputs. I try to quantify all of that.

> I don't understand what you mean by "inputs have incremental dimensionality". Incremented by who?

By comparing lower D patterns across higher-D coordinate, on a higher level of search.

> How did "1D patterns: sequences of matching pixels" become an "input"?

This is a hierarchy Derek, above-average lower-level patterns *are* higher-level inputs.

> You say "pixels" which implies a visual semantics for an input...

That's simply a visual version of maximal resolution 0D input, there is an equivalent in any modality.

Posted by Boris Kazachenko, last edited Sep 17, 2011 2:40 PM
Hi Boris,

I'm interested in understanding you because I am curious about all serious detailed theories of intelligence. There are many different approaches to this, and I'm interested in any that have significant amounts of precision or detail and seem intuitively plausible (as opposed to shallow, fundamentally incoherent, inconsistent, or simply insane). The trick is understanding them. It would be relatively easy to convince myself that I understand you at a rough overview level... but such characterization just feeds my ego, it doesn't (usually) increase my actual knowledge or insight.

In an approach like yours, I am most interested in a few interrelated particulars (in as much detail as I can manage): the "language" that is used to express patterns at each level of abstraction (as a combination of inputs, or more), the specific way that temporal relationships are incorporated into patterns, and the method used to individuate patterns as learned entities. I am fairly certain that you believe you have explained all these things in your knols, but I have not yet succeeded in extracting this information from your text. I also believe that other people bounce off of your writing for similar reasons. You say that your meanings are the most basic (decontextualized) possible, but natural language doesn't work that way, and in fact the meanings of what you write are largely embedded in your own context; failure to recognize this is what causes incomprehensibility. Although we all share a lot of cultural context, we are islands in many ways, and we have to build stepping stones to cross the deep and murky inferential gaps.

I think I will try to take into account everything you have said in this conversation, along with your conversation with Ben on the AGI list, and start over from the beginning. I'll return after I have bashed away at that for a while. If you care to say anything more about the things above that I mentioned as particularly interesting, that would be cool.

Thanks for taking the time to answer my questions, and I wish further success for you as you continue to develop your ideas!

Derek Zahn
http://supermodelling.net

Posted by Derek Zahn, Sep 19, 2011 9:28 AM
> You say that your meanings are the most basic (decontextualized) possible, but natural language doesn't work that way, and in fact the meanings of what you write are largely embedded in your own context; failure to recognize this is what causes incomprehensibility.

Maybe you can point out my biases, I promise to exterminate them without mercy :).

> I think I will try to take into account everything you have said in this conversation, along with your conversation with Ben on the AGI list,


That definitely turned you off :).

Posted by Boris Kazachenko, Sep 19, 2011 10:01 AM
Derek> I am fairly certain that you believe you have explained all these things in your knols, but I have not yet succeeded in extracting this information from your text. I also believe that other people bounce off of your writing for similar reasons.

I guess this is a deliberate filter - too much explicit explanation may make it seem too easy and obvious.

Posted by Todor Arnaudov, last edited Sep 22, 2011 3:38 AM
Derek: I am most interested in a few interrelated particulars (in as much detail as I can manage): the "language" that is used to express patterns at each level of abstraction (as a combination of inputs, or more), the specific way that temporal relationships are incorporated into patterns, and the method used to individuate patterns as learned entities. I am fairly certain that you believe you have explained all these things in your knols...

Boris: Yes I did, the "language" (I prefer "syntax") is simply a record of past operations, assigned to the data they produced. I tried to explain the initial set of such operations, & general principles that drive the expansion of this set.

Todor: I guess this is a deliberate filter...

Boris: it's partly deliberate, in the sense that examples may mislead people into thinking that they understand the generalization, while in fact they only understand the examples. But mostly it's because creative writing is not my top priority, - I have work to do. And this is an exceptional problem, so most people *should* "bounce off".

Todor Arnaudov:

Higher match within derivatives in a pattern, than between templates and lower level output:

Boris: "I won’t get into details here, but a higher level of feedback should suppress empirical data entirely, & select only the operations that process it. That would result in purely algebraic equations, “compared” to achieve mathematical compression. We can expect that better math will facilitate future discovery of empirical patterns, but at the cost of reduced correspondence of current memory contents."

Todor: This is maybe another phenomenon or not elaborated enough or wrong, but I made it up after reading this paragraph.

(1) Higher level patterns get complex - long, carrying lots of derivatives and heavy operations.

(2) Comparison gets more expensive than the predictive benefits. In the past the level may have been more predictive, but if it gets expensive to support it, the level can be either optimized or lost to free resources. While optimizing, the higher level suppresses the lower one because now it doesn't expect benefits from the new lower level input.

(3) The derivatives in long patterns (in any length patterns) can turn into local coordinate spaces on their own. A hierarchy on the derivatives is initiated, as if they (or selected parts of them) were raw sensory inputs - some derivative becomes "x", another "y", another "iB" etc. Longer patterns are more likely to have linear dependencies and other correlations, and patterns within patterns will be discovered.

(4) The process of (3) can start if high matches - within the patterns themselves, or between templates at the same level (let's call them InternalMatch) - are discovered. I suspect - when this InternalMatch is higher than the match between this level and the output from the lower level.

(5) In brief, if it once gets more predictive and cheaper to do the algebra using the already collected higher level derivatives, than to compute and store new high level derivatives from the lower level derivatives (input), then do the algebra and stop accumulating more "junk" data.


Last edited Sep 2, 2011 4:29 PM
> Comparison gets more expensive than the predictive benefits. In the past the level may have been more predictive, but if it gets expensive to support it, the level can be either optimized or lost to free resources.

You know I don’t like meaningless words like “optimized”. If a variable or a whole pattern becomes less predictive than the average per resources used, then it simply loses resolution: lower bits of value &| of coordinate (through aggregation across them).
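
The resolution loss described here can be illustrated with a short Python sketch: dropping the lower bits of a value, and aggregating values across adjacent coordinates. Both function names, the restriction to non-negative integers, and the choice of summation as the aggregation are assumptions for illustration; the comment itself does not specify them.

```python
def reduce_value_resolution(value, bits):
    """Zero out the lowest `bits` bits of a non-negative integer value."""
    return (value >> bits) << bits


def aggregate_coordinates(values, span):
    """Lower coordinate resolution by summing each run of `span` adjacent values."""
    return [sum(values[i:i + span]) for i in range(0, len(values), span)]
```

For example, reduce_value_resolution(203, 3) keeps only the bits above the lowest three, and aggregate_coordinates([1, 2, 3, 4, 5], 2) merges neighbouring positions into coarser ones, with the last group holding whatever remains.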

> While optimizing, the higher level suppresses the lower one because now it doesn't expect benefits from the new lower level input.

That’s what any feedback is for.

> The derivatives in long patterns (in any length patterns) can turn into local coordinate spaces on their own...

These sub-coordinates are formed | incremented with every new type of derivative. It’s not an optional process, you need them for selective access. Comparing across these “syntactic coordinates” is how you get higher powers of comparison (by division: iterative comparison between difference & match, etc.), dimensional proportions, & so on. But you’re right, I should make it more explicit.

> between templates at the same level (let's call them InternalMatch) - are discovered.

Between templates is not “internal”. You don’t compare across external & across syntactic coordinates at the same time, - that’s not incremental in complexity.

> I suspect - when this InternalMatch is higher than the match between this level and the output from the lower level.

Actually, comparison across syntax is done after evaluation before output, initially if its (across-level projected match) * (internal syntactic span) = *above average*. That means it’ll search on higher level, *&* is likely to be compressed by intra-comparison, which makes the search easier. It’s only after such intra-comparison that you can project & prioritize internal match independently from the external kind.

> In brief, if it once gets more predictive and cheaper to do the algebra using the already collected higher level derivatives, than to compute and store new high level derivatives from the lower level derivatives (input), then do the algebra and stop accumulating more "junk" data.

None of the above is about algebra. Internally or externally, you’re still comparing data, not operations. Comparing operations means comparing syntactic coordinates themselves, that’s what they stand for. (Can you think of initial types of such comparison?).

“Algebra” by itself is not predictive, it only gives you shorter equations to compute predictions from future data. It’s still all about data in the end, but math lets you be more selective in collecting it.

Posted by Boris Kazachenko, Aug 29, 2011 1:23 AM
>You know I don’t like meaningless words like “optimized”. If a variable or a whole pattern becomes less predictive than the average per resources used, then it simply loses resolution: lower bits of value &| of coordinate (through aggregation across them).

OK, I know about lowering the resolution to increase match. "Optimize" here - to make comparison of the same derivatives/at the same level cheaper by finding correlations within the level data and between derivatives in a pattern.

>> The derivatives in long patterns (in any length patterns) can turn into local coordinate spaces on their own...
>These sub-coordinates are formed | incremented with every new type of derivative.
>It’s not an optional process, you need them for selective access. Comparing across these “syntactic coordinates” is how you get higher powers of comparison (by division: iterative comparison between difference & match, etc.), dimensional proportions, & so on. But you’re right, I should make it more explicit.

OK, so that's when it's done (from the knol): "the power of comparison is increased if current match-per-costs predicts further improvement, as determined by “secondary” comparison of results from different powers of comparison, which forms algorithms or metapatterns."

>> While optimizing, the higher level suppresses the lower one because now it doesn't expect benefits from the new lower level input.
>That’s what any feedback is for.

I mean after a reliable formula is inferred, giving sufficiently high match/prediction, new lower level samples to improve prediction are not necessary.


>“Algebra” by itself is not predictive, it only gives you shorter equations to compute predictions from future data. It’s still all about data in the end, but math lets you be more selective in collecting it. (...)
>None of the above is about algebra. Internally or externally, you’re still comparing data, not operations. Comparing operations means comparing syntactic coordinates themselves, that’s what they stand for. (Can you think of initial types of such comparison?).

Not yet. However, it seems there are not many combinations. There is the position within the internal variables and the levels in the sub-coordinate hierarchy of this position; the basic comparison operations are just a few; iteration is supposed to be repetition until a given match/miss is achieved (above/below average or so).

Posted by Todor Arnaudov, last edited Sep 1, 2011 2:47 PM
> OK, so that's when it's done (from the knol): "the power of comparison is increased if current match-per-costs predicts further improvement, as determined by “secondary” comparison of results from different powers of comparison, which forms algorithms or metapatterns."

That’s only the first step: a comparison across adjacent derivation orders (syntactic coordinates). Beyond that are comparisons across syntactic discontinuity, such as between lengths of different dimensions within a pattern, & so on. I’ll make a separate chapter on syntax in the next edit, coming soon. That’ll include the “algebra” part, it really doesn’t belong in the feedback section.

> I mean after a reliable formula is inferred, giving sufficiently high match/prediction, new lower level samples to improve prediction are not necessary.

I think you mean reliable *pattern*, algebraic formulas are not predictive per se. In that case, *local* sampling is suppressed by expectations, but in favor of more distant sampling. I covered that in the section on feedback: “Downward suppression of locations with expected inputs will result in a preference for exploration & discovery of new patterns, vs. confirmation of the old ones”.

> Not yet. However, it seems there are not many combinations. There is the position within the internal variables and the levels in the sub-coordinate hierarchy of this position; the basic comparison operations are just a few; iteration is supposed to be repetition until a given match/miss is achieved (above/below average or so).

There is an infinite number of potential combinations, the trick is to explore them incrementally. Re iteration, it continues till match/cost is exhausted, not achieved.

Andrey Panin:

Boris, interesting perspective

Creating general AI is a very addictive problem, I have to say: one that fools many into thinking that its solution is just around the corner. Alas, all existing approaches lead to a dead end. I share your hope that there are structural reasons for that, such as either real world constraints that force those working on it to be practical in the short term, leading them to specialized solutions, or lack of knowledge in those for whom this is just a hobby. I am curious if you made any progress in the years since you posted this knol? Also I am curious to know if you discounted connectionist approaches (for anything other than perception) in favor of an algorithmic/symbolic approach, or you think it's a false dichotomy?

My personal feeling is that the solution will be in the form of an NN, because I haven't seen anything else come conceptually close to linking what at first seem like completely unrelated pieces of information.


Last edited Aug 8, 2011 10:04 AM
Thanks!

My knol is continuously updated, last time only a month ago. I am making "theoretical" progress, - simulation would be pointless since I refine the algorithm almost daily. What is it that you find interesting, &| unclear? I make no hard distinction between perception & "conceptual" levels, it's just a degree of generalization. Connectionist approach is not analytical enough, I think on the level of algorithms: nodes, not networks. Also, as I mentioned in the knol, it's not incremental enough, thus not scalable. I add one dimension at a time, starting from 0D, NNs start from 2D.

Posted by Boris Kazachenko, Aug 4, 2011 2:41 PM
I am interested in how far from completion you think your algorithm is. Is it far enough along to try out? I am sure you know that, no matter how nice a theory is, unless it's tested you can never be sure of what you have.
When you say NN is not scalable I hope you mean current implementations, but in theory it's the most scalable thing known as of now, by its concept - since our brain is but one version of it. Regarding dimension, not sure I see the limitation. A network of 1 node is 0D, isn't it?
What attracts me about NN is the concept of emergence of complexity out of simple units; it seems to be the underlying force in nature. To me, over-relying on an analytical approach is too brave of a step, since it's basically saying: we will find an alternative way to recreate intelligence other than the path we know already leads to one. To me it's like trying to understand the intricacies of an anthill through architectural focus rather than by generalizing from a unit of interaction between one ant and another ant.

With regard to solving the intelligence issue, one of the issues I find most challenging (apart from infinitely many others) is this: How to implement an inner drive (aka motivation) - that gives rise to switching and focusing attention, because without it an intelligent system would either be completely sensory driven (without internal importance filter) or it would engage in infinite pattern searching of one random (and could be completely useless) problem. How do you address this problem?

Posted by Andrey Panin, Aug 4, 2011 9:03 PM
> I am sure you know that no matter how nice a theory is unless it's tested you can never be sure of what you have.

If you're not interested in a theory, you're talking to the wrong guy. I don't care for blind tinkering.

> When you say NN is not scalable I hope you mean current implementations, but in theory it's the most scalable thing known as of now, by its concept - since our brain is but one version of it.

It's a matter of interpretation. ANNs have very little to do with real neurons / columns, to understand the latter you should be a neuroscientist. That's a legitimate route, guess I am too "brave" for that.

> Regarding dimension, not sure I see the limitation. A network of 1 node is 0D isn't it?

The limitation is inefficiency. Adding 1 dimension per level of search lets you select only the lower-D patterns that are strong enough to carry the overhead of additional coordinates. Without incremental selection you hit combinatorial explosion. Predictions are vectors, you can't have them without explicit coordinates.

> To me it's like trying to understand the intricacies of an anthill through architectural focus rather than by generalizing from a unit of interaction between one ant and another ant.

My approach *is* bottom-up, I start from pixels, you can't get any lower than that. But I do so using criteria derived from my definition of intelligence, without one you're flying blind.

> How to implement an inner drive (aka motivation) - that gives rise to switching and focusing attention, because without it an intelligent system would either be completely sensory driven (without internal importance filter) or it would engage in infinite pattern searching of one random (and could be completely useless) problem. How do you address this problem?

The only drive I care about is curiosity, - a cortical instinct. It's implemented by introducing a universal selection criterion, - predictive power. I am perfectly fine with "sensory-driven", the rest is either gross physiology or acquired through conditioning.

Posted by Boris Kazachenko, Aug 4, 2011 10:23 PM
Your definition of intelligence gives too broad of a range to be really useful as a discriminatory tool. An animal that hunts predicts and plans, a computer playing chess predicts and plans, a 2 year old child predicts and plans, a semi retarded person predicts and plans etc. What matters is how wide the scope of prediction is and how good the planning needs to be, to be considered a success in achieving AGI. Currently success is very incremental, which brings about a moving target in terms of what would and would not be considered AGI. I would be very curious if someone would actually discover a suitable intelligence criterion.

A computerized version of a neuron is first and foremost a conceptualized version - the reason I think building algorithmic AGI is braver than building a NN AGI is because the order of complexity is very different. It's much easier to concentrate on a small "unintelligent" building block (i.e. a neuron) which, once conceptualized correctly, will lead to intelligence, vs trying to reconstruct intelligence directly, wouldn't you agree?

Regarding pixels and 0 dimension - the brain is powerful enough that even being blind and deaf - and only having access to touch - which is a very crude input source - it is still able to acquire a picture of the world. So I am not convinced that dealing with 0 dimension is really required for true intelligence.

Curiosity is an interesting criterion, but I don't think it is sufficient - imagine an autistic person staring for hours/days at a flame because he/she is curious to find a pattern to it and predict the way the flame will look pix by pix a few minutes later. Something inside us tells us "that's not important - move on". No way of getting around the necessity of having an internal selection criterion that would say what's important and what's not, don't you think?

Posted by Andrey Panin, Aug 5, 2011 2:01 PM
> Your definition of intelligence gives too broad of a range to be really useful as a discriminatory tool. An animal that hunts predicts and plans, a computer playing chess predicts and plans, a 2 year old child predicts and plans, a semi retarded person predicts and plans etc. What matters is how wide the scope of prediction is and how good the planning needs to be, to be considered a success in achieving AGI.

Precisely, intelligence is a matter of degree, & I am suggesting a way to quantify & maximize it. What are you arguing against?

> I would be very curious if someone would actually discover a suitable intelligence criterion.

"Suitable" is a two-way street.

> It's much easier to concentrate on a small "unintelligent" building block (i.e. a neuron) which, once conceptualized correctly, will lead to intelligence, vs trying to reconstruct intelligence directly, wouldn't you agree?

I don't think it is conceptualized correctly, otherwise we'd have intelligent computers running around. You don't know what's easier till you've done it, Markram now wants ~$1B & 10 years to get "close" doing it. What I do know is that there are ~250K neuroscientists beating around the bush, & 1 of me making good progress theoretically. It takes guts to do AGI.

> Regarding pixels and 0 dimension - the brain is powerful enough that even being blind and deaf - and only having access to touch, which is a very crude input source - it is still able to acquire a picture of the world. So I am not convinced that dealing with 0 dimension is really required for true intelligence.

Pixels is just an example of 0D processing, any sense would do, though not as well as vision.

> Curiosity is an interesting criterion, but I don't think it is sufficient - imagine an autistic person staring for hours/days at a flame because he/she is curious to find a pattern in it and predict the way the flame will look, pixel by pixel, a few minutes later.

Curiosity is a motive, in psych. terms; the criterion is predictive power. You need to expand your scope of experience to maximize it, though the specific scope vs. precision trade-off depends on the noise in the inputs, & on the subject's time horizon. It's the same for autistics, they just put relatively greater value on precision.

Posted by Boris Kazachenko, last edited Aug 5, 2011 6:17 PM
>Precisely, intelligence is a matter of degree, & I am suggesting a way to quantify & maximize it. What are you arguing against?

I guess I missed/misunderstood the part where you quantified it; do you mind restating, for my benefit, how you quantify it? That would answer my intelligence test question as well.

> 1 of me making good progress theoretically

hence my question about how far from completion you are (as defined by your own min intelligence test) - do you have all necessary components in place (albeit in an unrefined state) or are there some that you still have to solve?


> You need to expand your scope of experience to maximize it, though the specific scope vs. precision trade-off depends on the noise in the inputs, & on the subject's time horizon

I disagree that scope vs precision depends on inputs only. It has to depend in large part on internally set goals/values. Inputs don't assign values - with no values trying to understand the complexity of a dust mite is the equivalent to understanding how to solve humanity's garbage crisis. I think unless we give an internal drive/goal/value criteria - intelligence produced by us will be a) useless b) will be difficult to test since it may appear autistic to all.


Posted by Andrey Panin, Aug 6, 2011 7:43 AM
> I guess I missed/misunderstood the part where you quantified it, do you mind restating it for my benefit how do you quantify it? That would answer my intelligence test question as well.

This whole knol is about that. 2nd paragraph: "the criterion must be predictive correspondence of recorded inputs.., - their cumulative match to future inputs".
I then quantified match on a single-variable level, later relative & unique match (2nd section), then introduced projected match (vs. contrast) & additive projection (vs. confirmation) in the 3rd section.
More abstract forms of correspondence (cumulative match) are defined by an incrementally complex algorithm, but allow for greater scope * precision of prediction.

> hence my question about how far from completion are you...

Completion is when the algorithm can self-improve (add efficient complexity) through computer simulation faster than I can improve it theoretically. That depends largely on the basal complexity of the algorithm, & I don't feel it's complex enough yet. I have several levels in mind that don't quite fit the already established pattern, once I have a better pattern (of increasing complexity) it should scale better.

> I disagree that scope vs precision depends on inputs only.

I didn't say that it does.

> It has to depend in large part on internally set goals/values. Inputs don't assign values - with no values trying to understand the complexity of a dust mite is the equivalent to understanding how to solve humanity's garbage crisis. I think unless we give an internal drive/goal/value criteria - intelligence produced by us will be a) useless b) will be difficult to test since it may appear autistic to all.

That kind of loose talk kept philosophers busy for millennia. To be constructive you need to work bottom-up.

Posted by Boris Kazachenko, Aug 6, 2011 11:00 AM
>Completion is when the algorithm can self-improve

you know I am seeing it very often among AGI thinkers - what I think is a conflation of two independent problems. It's hard enough to build intelligence, but to merge it with the even harder problem of building the kind of intelligence that improves itself is, I think, an indication of not understanding the problem in the first place.

>That kind of loose talk kept philosophers busy for millennia. To be constructive you need to work bottom-up.

that's not a serious answer. I don't know anything about what kept philosophers busy - I don't study philosophy, but in building AGI I did run into the problem of needing an ability to shift focus. Selecting for predictiveness is not a sufficient criterion because watching a movie a second time increases predictiveness for the next 2 hours - maximizing predictiveness forces us to keep watching the movie - but it takes something else to shift focus away. Everything you said thus far makes me think that you don't recognize this as a problem. I think that's something you will have to deal with when you actually try to run your program, if you ever get to that point. I think you are falling for the same fallacy as Jeff Hawkins does in his book, that is to assume that intelligence can be passive, i.e. input dictates output, when intelligence has to be active and even proactive.


Posted by Andrey Panin, Aug 7, 2011 6:06 AM
I already answered re shifting: predictive power = scope * precision, you need to increase both. And beyond that, I explained in the knol why you need discontinuous shifting, 3rd section, 4th paragraph:

"The next level of selection by feedback results in a preference for exploration over confirmation: we skip over too predictable sources / locations, thereby *reducing* match of new inputs to older templates. This doesn’t select for either proximity or contrast, & seems to contradict my premise. However, exploration should increase *projected* correspondence, which is a higher-level criterion than concurrently reduced *confirmed* correspondence."

Every issue you raised is addressed in the knol. You simply don't seem to care for theoretical understanding, & I don't care for tinkering. Too bad.

Posted by Boris Kazachenko, last edited Aug 7, 2011 10:26 AM
> you know I am seeing it very often among AGI thinkers - what I think is a conflation of two independent problems. It's hard enough to build intelligence, but to merge it with the even harder problem of building the kind of intelligence that improves itself is, I think, an indication of not understanding the problem in the first place.

It's not a different problem, - learning (increasing predictive correspondence) *is* self-improvement. And there should be no hard distinction between learning data & learning code, - both are driven by the same criterion, or fitness function. But it is a common fallacy to see intelligence as a fixed object.


Todor Arnaudov:

Events in programming are all the way through the hierarchy...

Hi, Boris, I happened to check you out in the right moment, a few notes in a domain I guess I'm competent:

>Besides, the events are assumed to be high-level concepts, preprocessed by human cognition. That’s the type of data
>programmers usually deal with, but general intelligence should not depend on preprocessing.

I beg to differ - not true for the "real programmers".

Events in programming start from hardware interrupts and binary flags (set/reset), it's abstraction of "change", "difference" and "selection" (message to this specific receiver who recognizes the event).

Also, a deep hierarchy of abstraction does exist in hardware and software, starting from "sensory inputs" (IC electrical inputs), flat and hierarchical blocks inside the IC, going to inter-ICs, and multiple levels of redirection in the OS and the software.

IMHO high level view on events belongs more likely to people from humanities, who have hard time thinking and remembering all those specific details.


>My approach, on the other hand, is to search for patterns within environmental input flow. I don’t even
>make a distinction
>between input patterns & problem-solving algorithms, -

OK.

> that’s an artifact of the way we design computers, to run hand-coded programs for specific tasks. It
>makes no sense in the
>context of continuous evolution of general intelligence, which should be recapitulated in AGI design.

I'm not sure the distinction comes from this per se, computers evolve to be ever more general tools, to run ever more general code (solve more general problems in one monolithic system) with ever less efforts for coding and ever more reuse and speed ups - from assembler, to functions, more complex built-in CPU instructions, ever higher level languages, libraries, OOP, OSes, hardware abstractions etc.

I think the issue comes from the way most computer users think: they don't realize how the brain starts crunching data, or the basic principles of GI. I guess this is similar to the way some AGI-haters say "computers can't understand language" or "computers can never think", and they explain it by claiming: "computers do exactly what we tell them to do" ==> they, the users, are incompetent and can't understand language, they can't make computers think.

Last edited Jul 13, 2011 6:24 PM
Hi Todor,

I was talking about "events" in BI (probabilistic calculus). They're assumed to be discrete, rather than artificially quantized analog sensory input flow. They call them "hypotheses" & "confirmations", does it sound like a low-level mindset to you? Re programming, I also meant high-level (symbolic) input data, rather than the code / hardware that manipulates it. All of our conscious experience is "high-level", more so for "people in humanities", but the programmers are not immune. Computers are general-purpose, but the programs aren't, except for their hardware interface. An example of the "artifact" I was talking about is separation between data cache & instruction cache on ALU level, I have no use for that.

Posted by Boris Kazachenko, last edited Jul 5, 2011 4:44 PM
Thanks for the reply!

About BI - sure, I also taught students that starting from high level is not going to scale, like Prolog, Cyc, expert systems, frame-based cognitive architectures etc.

>Re programming, I also meant high-level (symbolic) input data, rather than the code / hardware that manipulates it.

I tried to point at the understanding of the concept of "event" itself. For banking software, or for a researcher who's bad at programming, an "event" might be "being sunny or rainy". Real programmers and engineers who do DSP, computer vision, ML/RL or just low-level coding have a better "physical" idea.

>Re programming, I also meant high-level (symbolic) input data, rather than the code / hardware that manipulates it.

OK, but input data for some kind of software is as symbolic as quantized sensory matrix.


> All of our
>conscious experience is "high-level", more so for "people in humanities", but the programmers are not immune. Computers are
>general-purpose, but the programs aren't, except for their hardware interface. An example of the "artifact" I was talking
>about is separation between data cache & instruction cache on ALU level, I have no use for that.

This seems to me rather a detail and an optimization (parallelization): two independent buses (for speed, physical limitations), also instruction/data division is for simplicity and speed (preferably sequential reading for part of the input); for cache - data changes more rapidly than instructions, because self-modifying machine code is usually forbidden today etc.

It's a specialization, but it's transparent to target work, and there always will be some sort of physical or practical basis which will force some design decisions at the low level of implementation.

As for the seeing division instr./data artificial/practical - I agree that it's a POV/frame what to be interpreted as what, even for the "stupid" algorithms data is a part of the running algorithm (actual sequence and causal forces changing the system).


>Computers are general-purpose, but the programs aren't, except for their hardware interface.

Isn't it a matter of scope and complexity? Bigger "programs" such as OSes are to a great extent general purpose, and complex application software gets more general during development. Sure, not AGI, but as the number of functions grows, they're generalized as long as their parameters and structure start to repeat. And after all, generalization starts from comparing specific samples; programming a complex system generates samples to be generalized, it's how functions, structured programming, OOP and Design Patterns originated.

As of philosophers/social science types and programmers - you give more favour to the former, but typical philosophers have no chance formalizing intelligence themselves as well, because they don't understand, don't care ("it's beneath them") or don't have skills in programming, i.e. low level data representation and processing. IMHO a lot of philosophy consists of simple, obvious low complexity concepts - higher generality doesn't strictly mean complex or hard to derive. However these simple things are masked with big words.

Long ago I tried to explain to a philosopher that computers just seem "dull" to him because he sees them as "1 or 0", but actually he doesn't understand them. Well, let him say computers are doing exactly what he tells them after defining and understanding the dynamics of 10 or 100 billion dumb "1s and 0s": he's pushing buttons, billions of bits are updating. Programming seems "dull" to them as "generalist-types", because it's too hard for them.

Bottom line from me on this point is that there's more than being specialist/generalist or depth of hierarchy, it's also the resolution and scale of processing you do over that hierarchy.

Posted by Todor Arnaudov, last edited Jul 12, 2011 1:52 PM
> I tried to point the understanding of the concept of "event" itself.

That "concept" is meaningless by itself.

> As for the seeing division instr./data artificial/practical - I agree that it's a POV/frame what to be interpreted as what,

Right, & cache division is just one example of such "hard" separation, in programmer's mind as well as in computer architecture. It won't be an "optimization" if your code is incrementally derived from your data.

>> Computers are general-purpose, but the programs aren't, except for their hardware interface.
> Isn't it a matter of scope and complexity...

No, it's not simple scaling, the higher levels are mostly application-specific handles. Look, if you want to talk superficially related computerese, may I suggest AGI list?

> Bottom line from me on this point is that there's more than being specialist/generalist or depth of hierarchy, it's also the resolution and scale of processing you do over that hierarchy.

That's as trivial as your earlier talk of "raw power".

I don't favor philosophers, I said many times that philosophy is the most dysfunctional discipline next to theology. I just don't talk about them as much, because they don't try to build an AGI.

But you do sound like a philosopher yourself, talking about anything *but* the actual subject matter.
Care to discuss something potentially constructive?

Posted by Boris Kazachenko, last edited Jul 20, 2011 1:27 PM
>But you do sound like a philosopher yourself, talking about anything *but* the actual subject matter.

OK... I'm not even warming up now, "pre-warming".

>Care to discuss something potentially constructive?

Sure...

...

>No, it's not simple scaling, there's a ton of application-specific biases mixed-in on higher levels.

Nobody claims this is scaling the way you do it, it's not AGI. I claim that software engineers know about scaling, even if they're spoiling it for practical reasons.

>Look, if you want to talk superficially related computerese, go to AGI list.

I don't want to; I wanted to share a few thoughts on these computer-related topics.


>That's as trivial as "raw power" from your earlier attempts.

A bit reworded bottom line: Real programmers shouldn't be underestimated, they have "raw power" and an idea of scaling.

>I don't favor philosophers, I said many times that philosophy is the most dysfunctional discipline

I know, but you mention them as supposed to possess a more appropriate mindset, while programmers are completely lost in your opinion.

Posted by Todor Arnaudov, last edited Jul 13, 2011 3:16 PM
> Real programmers shouldn't be underestimated, they have "raw power" and an idea of scaling.

Show me.

> I know, but you mention them as supposed to possess a more appropriate mindset,

That was re "real" philosophers, not the kind you would hear about. Except for myself.
If I grew up in the west, I'd probably start by studying philosophy (esp. philosophy of science), but drop it after realizing that cognition must be defined at sensory level. Cognitive process is the only legitimate subject for philosophy, the fact that "philosophers" aren't working on it is a different matter.



Todor Arnaudov:

Task: Comparing a single-integer input to a fixed-length continuous sequence of older inputs

Hi Boris, I'm loading my gun for a new shoot. :)

B>If you want to get constructive (meaningful), try to formalize comparing a single-integer input to a fixed-length continuous sequence of older inputs, & then form its prediction over the next sequence of the same length & direction.

Actually I got an idea immediately, even shared a bit with my students, but it seemed too simple. However now I believe it should be simple, there shouldn't be rocket science in a few numbers and all patterns should be derivable from the mere numbers and their relations, such as start value, differences, changes.

So, my first guess is that this seems similar to DSP and might be related to delta coding and linear prediction. For a start I thought only of subtraction difference, it's effective for low ratio smooth changes. However I see now you've added more clues in the article and also division and logarithm are more appropriate for high ratio and very high ratio changes.

I see also that applying different kinds of comparison is needed in order to be able to *select* the right one if some matched, some mismatched; like in the following example with the shortest possible sequences:

First sequence:

[5 6]
Pattern:

Length = 1
Start = 5
Add Diff = 1
Ratio Diff = 1.2
Direction Diff = 1 (+)

New number:
[5 6] 7

Compare the difference to the last number of sequence, and the match to the pattern:

Add Diff = 1 (match 1)
Ratio Diff = 7/6 (match 0.935)
Direction Diff = 0 (+) (match 1)

A new sequence, assuming algorithm doesn't care about the mismatch of the Start value (coordinates).

[50 51]

Add Diff = 1 (m 1)
Ratio Diff = 51/50 ( m 0.85)
Dir Diff = 0 (m 1)

What matches better is the Add Diff, the algorithm should ignore ratio mismatch and will predict 52.

However:

[100 110]

Add Diff = 10 (m 0.1)
Ratio Diff = 1.1 (m 0.92)
Dir Diff = 0 (m 1)

Now Add Diff match is very low, but Ratio Diff match is almost identical to the match between the pattern and the new number, therefore: 7/6*110 = 128.33 = (int) 128.
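The selection step in the examples above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `match` measure (smaller value divided by larger) is a guess that reproduces the 0.85, 0.92 and 0.1 figures but gives about 0.97, not 0.935, for 7/6 against 1.2; the function names and the `pattern`/`latest` split are hypothetical, and the sketch only handles positive, increasing sequences.

```python
def match(a, b):
    """Assumed similarity of two positive numbers: smaller / larger, in (0, 1]."""
    return min(a, b) / max(a, b)


def compare(prev, new):
    """Additive and ratio differences between two consecutive inputs."""
    return {"add": new - prev, "ratio": new / prev}


def select_and_predict(pattern, latest, seq):
    """Pick the comparison type that best matches the pattern, then project it.

    pattern: differences stored from the first sequence, e.g. compare(5, 6)
    latest:  most recent differences seen in that sequence, e.g. compare(6, 7)
    seq:     new sequence of positive, increasing numbers
    """
    diffs = compare(seq[-2], seq[-1])
    matches = {kind: match(diffs[kind], pattern[kind]) for kind in diffs}
    best = max(matches, key=matches.get)
    if best == "add":
        prediction = seq[-1] + latest["add"]
    else:
        prediction = seq[-1] * latest["ratio"]
    return best, matches, prediction


pattern = compare(5, 6)  # add 1, ratio 1.2
latest = compare(6, 7)   # add 1, ratio 7/6

best_a, matches_a, pred_a = select_and_predict(pattern, latest, [50, 51])
best_b, matches_b, pred_b = select_and_predict(pattern, latest, [100, 110])
print(best_a, pred_a)       # additive comparison wins for [50 51]
print(best_b, int(pred_b))  # ratio comparison wins for [100 110]
```

Under these assumptions the additive difference is selected for [50 51] and the ratio for [100 110], as in the text.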

Last edited Jul 9, 2010 5:49 AM
Huh? This is not even wrong, - there's no algorithm, just ad hoc examples. I don't *ever* want *any* examples, - they pollute the mind. Use algebraic variables, not the actual numbers.

Actually, the examples *are* wrong. Forget about higher orders of comparison, DSP, & whatever other "hammers" you happen to know about, think in terms of the purpose. You keep talking about the differences, but the purpose is to project *match*, as a distinct variable. You don't predict the next input, every past input is already a prediction. You need to quantify accuracy (match) of that prediction for the next n comparisons, based on the past n comparisons. Hint: *projecting* a match means adjusting it for the "cost" of search, & for competing projection of accumulated difference. If you figure this out, it'll be a first step down a long road.

Posted by Boris Kazachenko, last edited Jul 8, 2010 2:05 PM
OK, it's wrong and I've misinterpreted it, but "next sequence" in the question could mean also a new different one, not only a continuation of the past.

My example was about: [a1, ..., an, x] -?-> [b1, ... bn, y=?], [c1, ... , cn, z=?], ...
While now I guess it should be: [a1, a2, ... an x b1 b2 ... bn] ==> (a1 .. an) x -?-> (b1 ... bn),

A correlation rather than an extrapolation. How the past input was predictive to the following input that *really happened*, rather than about a prediction of the next value *before it happens*.
Prediction before a value happens should come later, using justified selected predictive patterns with quantified match. I'll think about it.
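The revised reading, (a1 ... an) x -?-> (b1 ... bn), can be sketched as a retrospective comparison of one input against the n inputs before it and the n inputs that actually followed. This is a rough sketch only: the `match` measure (smaller value divided by larger, for positive inputs) and the function name are assumptions made for illustration, not the definition from the knol, and the averaging is just one way to accumulate match.

```python
def match(a, b):
    """Assumed similarity of two positive numbers: smaller / larger, in (0, 1]."""
    return min(a, b) / max(a, b)


def past_and_future_match(inputs, i, n):
    """Average match of inputs[i] to the n inputs before and the n after it.

    The "future" part is evaluated in hindsight, on inputs that really
    happened, so it measures how predictive inputs[i] turned out to be.
    """
    if i - n < 0 or i + n >= len(inputs):
        raise ValueError("need n inputs on both sides of position i")
    x = inputs[i]
    past = inputs[i - n:i]
    future = inputs[i + 1:i + 1 + n]
    past_match = sum(match(x, a) for a in past) / n
    future_match = sum(match(x, b) for b in future) / n
    return past_match, future_match


# x = 10 matched its past perfectly, but the input then doubled.
past_m, future_m = past_and_future_match([10, 10, 10, 10, 20, 20, 20], i=3, n=3)
print(past_m, future_m)
```

A past match that stays high while the future match drops is the case where the older pattern stopped being predictive.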


Maximizing predictive-correspondence which maximizes reward

Hi Boris,

A guess... (Or a new shot in the dark :) )

I think that the mind favors maximizing predictive-correspondence which maximizes reward, I suppose this is related to what you and psychologists call hierarchy of needs. Maximizing predictive-correspondence/compression can be assumed as a form of reward for itself, as well as misses/errors – a “punishment”, but there must also be lower “root” rewards to generate initial behavior and to drive initial focus on selected stimuli.

>past patterns are decreasingly predictive with the distance / delay from expected inputs,
>Recent inputs are relatively more predictive than the old ones by the virtue of their
> proximity to future inputs. Thus, proximity should determine the order of search within
> a level of generality.

Is it always so? I suspect not. It is possible to have delayed patterns, where activity “now” is dependent on changes that happened long ago. The fresh input buffers are cheapest to check quickly, even if they're not the most relevant, and if the input buffers are too short, the mind has no choice but to search for patterns there; a machine with a longer buffer may learn much faster. There could be a “cache”/”stack” for old inputs which are expected to be predictive with a delay.
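The buffer-length point can be illustrated with a small Python sketch: a search over delays can only find a delayed pattern if the buffer reaches back far enough. The function name, the mean-absolute-difference score, and the toy signal are all assumptions made for this illustration.

```python
def best_lag(signal, max_lag):
    """Find the delay (1..max_lag) at which past inputs best match later ones.

    Returns the best lag and a dict of mean absolute differences per lag;
    a lower value means the delayed inputs were more predictive.
    """
    errors = {}
    for lag in range(1, max_lag + 1):
        pairs = list(zip(signal[lag:], signal[:-lag]))  # (x[t], x[t - lag])
        if pairs:
            errors[lag] = sum(abs(a - b) for a, b in pairs) / len(pairs)
    return min(errors, key=errors.get), errors


# Toy input: each value depends on the value five steps earlier.
signal = [1, 4, 2, 8, 5] * 4

short_lag, short_errors = best_lag(signal, max_lag=3)  # buffer too short
long_lag, long_errors = best_lag(signal, max_lag=6)    # buffer covers the delay
print(short_lag, short_errors[short_lag])
print(long_lag, long_errors[long_lag])
```

With the shorter buffer every candidate delay leaves some error, while the longer buffer finds the delay of five steps with zero error.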

Also, such a correlation between recent inputs and close future inputs is apparent when the patterns are inertial/slow-changing/low frequency ones and the activity passes through adjacent coordinates, like in the HTM basic vision demo. Many (or most) of the input patterns do, but I guess - not all.

Also rewarding old inputs can be much more predictive than new unrewarding ones, because mind searches how to maximize their predictive power to future inputs, while it may ignore and miss to evaluate recent inputs which are expected to be unrewarding (and nonthreatening), unless they are attached to rewarding ones making them rewarding as well (or such to avoid punishment).

Overall, I suspect that a reward function(s) need to be added to predictive correspondence, and proximity and recentness may need to be more abstract.

Regards
Todor

Last edited May 22, 2010 8:06 PM
> A guess... (Or a new shot in the dark :) )

At least you're shooting at the right target :).

> I think that the mind favors maximizing predictive-correspondence which maximizes reward, I suppose this is related to what you and psychologists call hierarchy of needs.

My hierarchy is a sequential development of generalized means, which are then conditioned to become needs/wants. Basic cognition is driven by a very low-level inherited algorithm, without it this development can't even start.

> Maximizing predictive-correspondence/compression can be assumed as a form of reward for itself, as well as misses/errors – a “punishment”, but there must also be lower “root” rewards to generate initial behavior and to drive initial focus on selected stimuli.

Initial behavior is instinctive, & curiosity is one of the most basic: the knowledge instinct: http://en.wikipedia.org/wiki/Leonid_Perlovsky
Initial cognition is driven by a low-level design of neocortex (most likely minicolumn: http://brain.oxfordjournals.org/cgi/content/full/125/5/935 ), it doesn't need any extra-cortical "rewards".

> It is possible to have delayed patterns, where activity “now” is dependent on changes that happened long ago. The fresh input buffers are cheapest to check quickly, even if they're not the most relevant, and if the input buffers are too short, mind has no choice, but searching for patterns there; a machine with longer buffer may learn much faster. There could be a “cache”/”stack” for old inputs which are expected to be predictive with a delay.

Yeah, that's what I call "higher levels of generalization". Those *are* older inputs, only compressed, & selected accordingly.

New Edit: my mistake, that's a good idea, though not well justified. See the first prize in the knol.

> Also rewarding old inputs can be much more predictive than new unrewarding ones, because mind searches how to maximize their predictive power to future inputs, while it may ignore and miss to evaluate recent inputs which are expected to be unrewarding (and nonthreatening), unless they are attached to rewarding ones making them rewarding as well (or such to avoid punishment).

You mean that we can make inputs more predictive by reproducing them? That means going way back to a lower stage of meta-evolution :).

> Overall, I suspect that a reward function(s) need to be added to predictive correspondence,

I'd suggest that you forget about subcortical nonsense, it's part of the problem, not part of the solution.

> and proximity and recentness may need to be more abstract.

That's already explained in the knol. It's true that a mind will skip over too predictable inputs, even if not driven by non-cognitive rewards. It's a form of novelty seeking that is not maximizing proximity, contrast, or even actual match. I didn't explain that in the knol. If you can define the criterion that's maximized in such "exploration mode", that would warrant a consolation prize :).

Boris

Posted by Boris Kazachenko, last edited Jun 29, 2010 5:11 AM
>At least you're shooting at the right target :)

Finally! ;)

>> Maximizing predictive-correspondence/compression can be assumed as a form of reward for
>>itself, as well as misses/errors – a “punishment”, but there must also be lower “root” rewards
>>to generate initial behavior and to drive initial focus on selected stimuli.

>Initial behavior is instinctive, & curiosity is one of the most basic: the knowledge instinct:
>http://en.wikipedia.org/wiki/Leonid_Perlovsky
>Initial cognition is driven by a low-level design of neocortex (most likely minicolumn:
>http://brain.oxfordjournals.org/cgi/content/full/125/5/935 ), it doesn't need any extra-cortical
>"rewards".

Thanks for the links! I've missed Leonid and yes, I do have to check out the "raw scientific input" about the columns...

>>Also rewarding old inputs can be much more predictive than new unrewarding ones,
>>because mind searches how
>>to maximize their predictive power to future inputs, while it may ignore and miss to evaluate
>>recent inputs which are expected to be unrewarding (and nonthreatening), unless they are
>>attached to rewarding ones making them rewarding as well (or such to avoid punishment).
>>Overall, I suspect that a reward function(s) need to be added to predictive correspondence,

>You mean that we can make inputs more predictive by reproducing them? That means going
>way back to a lower stage of meta-evolution :).
>I'd suggest that you forget about subcortical nonsense, it's part of the problem, not part of the
>solution.

Elegantly said... :)

If I'm getting this right:

>My hierarchy is a sequential development of generalized means, which are then
>===conditioned to become needs/wants.===

Then to you is behavioral/conditioning part - this makes sense. I think that is probably another hierarchy (what you call hierarchy of needs), where lower brains (brainstem, amygdala, hypothalamus) are higher levels of *control* (basic needs) than the highest level of cognitive hierarchy, and the direction is evolutionary backwards. At least this is true right when you "switch on" a human.

However I do believe going back in meta-evolution makes sense, because subcortical regions are more primitive. Actually inputs do get sort of more predictable (or at least subject's behavior gets more predictable, so pleasing patterns are generally more predictive than not pleasing ones).

This is how love and addictions self-feed - by reproduction of recorded behaviors that led to a pleasure.


>> It is possible to have delayed patterns, where activity “now” is dependent on changes that
>>happened long ago. The fresh input buffers are cheapest to check quickly, even if they're not
>>the most relevant, and if the input buffers are too short, mind has no choice, but searching
>>for patterns there; a machine with longer buffer may learn much faster. There could be a
>>“cache”/”stack” for old inputs which are expected to be predictive with a delay.

>Yeah, that's what I call "higher levels of generalization".
>Those *are* older inputs, only compressed, & >selected accordingly.

OK. :) At this point my terminology is “higher level virtual universes”, “higher level virtual simulators of virtual universes”, “higher level of control”.

The laws of physics of the higher level universes are built by sequences and sets of lower level laws, which on their own have their laws of physics and sub-universes. Laws of physics and virtual universes are predictive patterns (systems of patterns), extracted from sensory input and used to predict. On the lowest level, laws are not compressed, this is "the reality":

- in real Universe, you have to simulate all in order to predict and have exact representation of the future at Universe meaningful resolution (Planck's constant etc.)
- in thinking machine or human mind this is the raw sensory input that causes cognition to start

In order to interact/interface with the lowest level universe for the system, higher level must decompress its representations throughout the hierarchy, and each level down adds details, making the picture increasingly sharper.

>> and proximity and recentness may need to be more abstract.
>That's already explained in the knol.

Just going up in the hierarchy?


>It's true that a mind will skip over too predictable inputs, even if not driven by non-cognitive
>rewards. It's a form of novelty seeking that is not maximizing proximity, contrast, or even actual
>match. I didn't explain that in the knol. If you can define the criterion that's maximized in such
>"exploration mode", that would warrant a consolation prize :).

Nice... :)

My first intuitive guess is predictive range, compression ratio; I think it's related to minimum message length/Kolmogorov complexity.

I'm not sure if these concepts are the answer to your question, but they sound interesting to me anyway. Sounds like “Predictability ...

Posted by Todor Arnaudov, last edited May 12, 2010 4:04 PM
Hmm, my comment seemed too long, continues:

I'm not sure if these concepts are the answer to your question, but they sound interesting to me anyway. Sounds like “Predictability Analysis/Calculus”. :)

- For how long in the future/in space predictions are expected to match real input, based on the pattern and how much input data are enough to predict the whole future input, generated by the pattern. This is particularly apparent for simple patterns that are expected to take a lot of time, like speaking aloud 1, 2, 3, ..., 1 million. :) Generally, if you know the end from the beginning, you don't need to keep attention on the process.

- Predictability range and predictability precision of the new input, based on the recent/immediate or local input from a pattern, or more generally - how parts from an input assist in prediction/compression of other parts of the input. If going meta – how parts from a pattern assist in prediction of other parts of the pattern itself.

I noticed this in the past in a section from my writings with speculations about interestingness in pictures, e.g. generally a photographer would qualify this photo http://eim.hit.bg/3/25/tee1.jpg as boring, while the next one - (more) interesting: http://eim.hit.bg/3/25/kalof94.jpg
Interestingness is subjective, but this is true at least for the measure below:

The first photo can be drawn from a portion of it, extended with a simple cycle with instructions how to stretch and copy in perspective (implying mind does this and stores images this way - compressed and doing transformations and operations). The second one can't be compressed that way (not so simply), also there are more meaningful recognizable objects and mind needs to engage more. This is what Interestingness is all about - engaging mind to watch and try to predict what would come next. There are other aesthetic reasons for the interestingness as well - emotional, “organic” appearance/smoothness, dynamics - expected possible change in pictures with animate objects; however, this is another story.
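A rough illustration of the compressibility point above, under stated assumptions: zlib is used here only as a stand-in for whatever compression a mind performs, the byte strings are hypothetical toy "images", and `compression_ratio` is a name invented for this sketch. A repeated pattern compresses to a small fraction of its size, while varied data barely compresses at all.

```python
import random
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower = more predictable)."""
    return len(zlib.compress(data, 9)) / len(data)


# Hypothetical stand-in for the "boring" photo: one small tile repeated.
repetitive = bytes([10, 20, 30, 40]) * 1024

# Hypothetical stand-in for the "interesting" photo: no simple repeating rule.
rng = random.Random(0)  # fixed seed so the example is reproducible
varied = bytes(rng.randrange(256) for _ in range(4096))

print("repetitive:", compression_ratio(repetitive))
print("varied:    ", compression_ratio(varied))
```

By this measure alone pure noise would also count as "interesting", since it is incompressible; the measure only captures the "can be redrawn from a portion of itself" part of the argument.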

- Function of predictability in time/space. How prediction precision changes throughout the accumulation of more data. If precision stops rising, rises too slowly or reaches very high levels, the watching may stop – this is a saturation of the function of predictability through time. If I try to use your terms (hoping correctly) – if it's not possible to discover increasingly predictive short-cuts for a particular pattern anymore, it may be skipped over. This rule skips noise, as well.
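The saturation rule above could be sketched roughly as follows. This is only one possible reading: the function name `keep_watching` and the thresholds `min_gain` and `ceiling` are assumptions made for illustration, and precision is assumed to be a number between 0 and 1 recorded after each new batch of data.

```python
def keep_watching(precisions, min_gain=0.01, ceiling=0.95):
    """Return False once the function of predictability has saturated.

    precisions: prediction precision (0..1) measured after each new batch
    of input, oldest first. min_gain and ceiling are illustrative values.
    """
    if len(precisions) < 2:
        return True  # not enough history to judge yet
    latest, previous = precisions[-1], precisions[-2]
    if latest >= ceiling:
        return False  # already predictable enough, nothing left to learn
    # Stop when precision has stopped rising or rises too slowly.
    return (latest - previous) >= min_gain
```

Under these assumptions, noise is skipped because its precision stays flat at a low value, and a trivially predictable pattern is skipped because it reaches the ceiling quickly.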

It is possible for the function of predictability to rise in a moment, e.g. seeing a flat blue banner.

Also, for a level of hierarchy, when predictability saturates, that is when a level can predict the future with a precision over a threshold, the hierarchy may grow and start searching for more complex patterns (in my terms - construct higher level virtual universes/simulators of universes).

Right, hierarchy may grow and probably should try to grow all the time, but the upper level would not be reliable until the base level stabilizes.


Todor

Posted by Todor Arnaudov, last edited May 12, 2010 4:08 PM
> Thanks for the links! I've missed Leonid and yes, I do have to check out the "raw scientific input" about the columns...
I like Perlovsky’s explanation of the “knowledge instinct”, but his “Dynamic Logic” doesn’t seem to be very deep.

> Then to you is behavioral/conditioning part - this makes sense. I think that is probably another hierarchy (what you call hierarchy of needs), where lower brains (brainstem, amygdala, hypothalamus) are higher levels of *control* (basic needs) than the highest level of cognitive hierarchy, and the direction is evolutionary backwards.

There’s no “control”, - computer analogies are misleading. Analogical thinking is a blunt instrument, try to avoid it. Higher motives are the ones that ultimately win over, not the ones that develop earlier. All brain areas have an inherited structure that determines their initial (instinctive) operation. Brain stem, amygdala, hypothalamus develop earlier, & their instincts dominate at first. Basic curiosity is a “cortical instinct”, likely driven by the structure of minicolumn, & neocortex is the last to fully develop. But that’s a genetically determined part. Postnatally, motivation develops by competitive conditioning of inherited motives & acquired value-loaded patterns in all of those areas. Conditioning is reinforcement of coincident (instrumental) & suppression of counter-incident (interfering) motives & stimuli patterns by all other motives. These patterns become acquired motives, but they're *not* lower than the original ones. Higher or lower is a matter of strength, not of origin. Cortical cognition discovers more general patterns, that get relatively stronger because they stay instrumental longer. And curiosity itself is instrumental for discovery of all these patterns, so it ultimately becomes the top value & suppresses all others. You don't need any subcortical drives even to start, unless you have human physiology to take care of. But basic curiosity (I don't know the full "structure" of it yet) is only a start too. Introspective cognition derives higher orders of correspondence, developing things like mathematical curiosity.

> OK. :) At this point my terminology is “higher level virtual universes”, “higher level virtual simulators of virtual universes”, “higher level of control”.

I don't like your terminology. It's "fluffy": redundant, pretentious, fuzzy & misleading. That's a bad taste in science, as distinct from art. Artist thrives on analogical confusion, Scientist abhors it & craves analytical clarity. Make your choice. Your interpretations “sound” wrong on many levels, but you don’t really define your terms, to the extent that they're different from mine. If you think they’re more expressive, or your conclusions are different from mine, please explain how. Try to think more & talk less, you know, review & rewrite your reply for a few days before posting it :).

> In order to interact/interface with the lowest level universe for the system, higher level must decompress its representations throughout the hierarchy, and each level down adds details, making the picture increasingly sharper.

I don't know if there's any need for decompression, higher levels may only adjust focus (input span & resolution) for lower levels. Patterns of different scope / generality must be kept separate to avoid "paradoxes" :).

>>> and proximity and recentness may need to be more abstract. >>That's already explained in the knol. >Just going up in the hierarchy?

Up *&* down, that's what the hierarchy is all about. But neither direction is fully explained in the knol (even to the extent that I understand them), so use your imagination.

> My first intuitive guess is predictive range, compression ratio; I think it's related to minimum message length/Kolmogorov's complexity...

It all sounds vaguely relevant, but defining a criterion means quantifying it. It’s not match so it’s not an actual compression, or even future compression. “Expected”, “predicted”, "partial" - how do you derive those things from pixel-level inputs? Because if you can't do it there, you can't do it anywhere, - combinatorial explosion gets you. I’ve shown how to quantify a basic match, & that still stands as an initial criterion. How do you derive from it a higher-order criterion that drives exploration? You gave a bunch of higher-level examples, but I am not even going to bother with them, that's not where I operate.
If you want to get constructive (meaningful), try to formalize comparing a single-integer input to a fixed-length continuous sequence of older inputs, & then form its prediction over the next sequence of the same length & direction.

Posted by Boris Kazachenko, last edited May 13, 2010 7:22 PM
I realized an important difference - a different POV to a mind. My theory was not inspired by brains, minicolumn hypothesis or so, it was a sketch/direction/aimed at a unifying theory of mind and systems evolution in Universe.

Attempts to fit it exactly to brain regions cause a mess - there are overlaps and similarities to HTM, minicolumn hypothesis, your theory, brains, but mine is different, digital, sketchy and was not as precisely defined, I see there were implied things which were not clearly specified and separated.

My speculations were based on observations on causality/determinism (causal interdependency) and tendency of evolving systems at prediction, repetitive and predictive behavior with ever higher precision, resolution and range. "Control" in my writings was meaningful and it's system's (module's) capability to predict and cause the future of what it controls with certain probability/precision, where control is formalized as a write to a memory, i.e. making certain target changes in an output environment.

Mind is a compound/complex "control unit" itself, aiming at maximizing its capabilities to predict (imagine) and cause, where Universe is the ultimate control unit, "predicting" and causing everything at the maximum possible resolution, including mind itself, which is a "virtual sub-universe".

>There’s no “control”, - computer analogies are misleading.

My "mind sketch" was digital.

>(...) All brain areas have an inherited structure that determines their initial (instinctive) operation.
>Brain stem, amygdala, hypothalamus develop earlier, & their instincts dominate at first. Basic
>curiosity is a “cortical instinct”, likely driven by the
>structure of minicolumn, & neocortex is the last to fully develop.
> (...) But basic curiosity (I don't know the full "structure" of it yet) is only a start too. Introspective
>cognition derives higher orders of correspondence, developing things like mathematical curiosity.

Thanks, I see.

>I don't like your terminology. It's "fluffy": redundant, pretentious, fuzzy & misleading.
>That's a bad taste in science, as distinct from art. Artist thrives
>on analogical confusion, Scientist abhors it & craves analytical clarity. Make your choice.
>Your interpretations “sound” wrong on many levels, but you don’t really define your terms, to the extent that they're different from mine. If you
>think they’re more expressive, or your conclusions are different from mine, please explain how.

I'd make both choices. :)

Sometimes your definitions remind me of my observations, and my interpretations are related to my theory and make sense *there*. Right - this is a mess.

Match, comparison, difference between predicted and expected, compression, a basic algorithm that learns other algorithms and data and collects them, complexity growth and a sort of algorithmic complexity (but re-invented) etc. are some terms and topics from my writings. I'm not ready with a solid compressed explanation yet, though.

>I don't know if there's any need for decompression, higher levels may only adjust focus (input span &
>resolution) for lower levels. Patterns of different scope / generality must be kept separate to avoid
>"paradoxes" :).

Does adjusting focus mean:
- selecting/allowing comparison with more recorded samples is sort of widening of span - more general comparison.
- lowering the resolution allows recognition of fuzzy/pixelized images and results in a higher match ratio - a more general comparison.

I mean this: a word, a concept can be recorded and operated with a few bits from highest levels, but this is just a label, it makes sense in a high level virtual universe (imagination), but it needs much more raw data in order to be derived from a low level and to be expressed back there.

>Up *&* down, that's what the hierarchy is all about. But neither direction is fully explained in the
>knol (even to the extent that I understand them), so
>use your imagination.

OK

>It all sounds vaguely relevant, but defining a criterion means quantifying it.
>It’s not match so it’s not an actual compression, or even future compression.
>“Expected”, “predicted”, "partial" - how do you derive those things from pixel-level inputs?

I perfectly understand that it should start from the lowest level and the mechanics must be precisely defined. That is what I'm supposed to do "when I manage to concentrate"...

>I’ve shown how to quantify a basic match, & that still stands as an initial criterion.
>How do you derive from it a higher order criterion that drives exploration?
>(...)
>try to formalize comparing a single-integer input to a fixed-length continuous sequence of older
>inputs, & then form its prediction over the next sequence of the same length & direction.

Thanks for the task! I may have a break now and will be back later.

Todor

Posted by Todor Arnaudov, last edited May 17, 2010 6:54 AM
> I realized an important difference - a different POV to a mind. My theory was not inspired by brains, minicolumn hypothesis or so,

Me neither, I am a generalist.

> it was a sketch/direction/aimed at a unifying theory of mind and systems evolution in Universe. My "mind sketch" was digital.

I suspect it was an attempt to project your computer experience into areas where it doesn't belong. Very typical for AI tinkerers, - lots of ambition, but no clue.

> Attempts to fit it exactly to brain regions cause a mess - there are overlaps and similarities to HTM, minicolumn hypothesis, your theory, brains, but mine is different, digital, sketchy and was not as precisely defined

You don't really understand things you can't define. Your attachment to ill-formed assumptions of your youth, as well as constant self-promotion, is probably a sign of insecurity.

> I'd make both choices. :)
That's not making a choice. You'll do neither well, & even “well” is useless here, only the-best-in-the-world will do.

> Does adjusting focus mean:
> - selecting/allowing comparison with more recorded samples is sort of widening of span - more general comparison.
> - lowering the resolution allows recognition of fuzzy/pixelized images and results in a higher match ratio - a more general comparison.

Neither, both work "upward", focusing is downward. Guess again.

> I mean this: a word, a concept can be recorded and operated with a few bits from highest levels, but this is just a label, it makes sense in a high level virtual universe (imagination), but it needs much more raw data in order to be derived from a low level and to be expressed back there.

Raw data is what you start with. It’s lost during selective elevation & you won’t regain it by decompression. Patterns on every level are search-range-defined. “Expressing” high-level patterns on lower levels will only create confusion about their “true” range (& you’re confused enough:)). There’s no need for it anyway, higher-level “expectations” are compared to lower-level “experience” when the latter is selectively elevated, not vice-versa.

> I perfectly understand that it should start from the lowest level and the mechanics must be precisely defined. That is what I'm supposed to do "when I manage to concentrate"...

You won’t, until & unless you change lifestyle. You need a boring life.

Posted by Boris Kazachenko, last edited May 17, 2010 9:09 PM
>I suspect it was an attempt to project your computer experience into areas where it doesn't
>belong. Very typical for AI tinkerers, - lots of ambition, but no clue.

Don't forget imagination and creativity - my kingdom :) - "universal simulators of virtual universes" are engines of imagination. Indeed I think art gives many clues about intelligence and the big picture of mind.

>You don't really understand things you can't define.
>Your attachment to ill-formed assumptions of your youth,
>as well as constant self-promotion, is probably a sign of insecurity.

Insecure - I am, this is correct. I need to make a breakthrough in order to stabilize life, income and start feeling more secure: a successful novel, a beautiful film with touching performance or so, and it's frustrating to balance time, wait and be unable to raise the resources needed.

Self-promotion - I don't have a real personal PR, an agent or so, I'm not acknowledged yet. Must attract followers and make contacts somehow, I want to start-up a business out of my art after all. I'd prefer somebody else to promote me.

Youth assumptions - I want to focus, understand and clear them out, before throwing them away. I'm attached, because I haven't finished with this.

See you in the next iteration!

T

Posted by Todor Arnaudov, last edited May 18, 2010 9:03 AM
Art = fluff. You love fluff, & crave attention, the rest is just an excuse.

Trying to focus on "understanding" the assumptions made when you understood a lot less than you do now is pathetic. You need to understand the subject matter - cognitive algorithm.

Posted by Boris Kazachenko, last edited May 18, 2010 4:18 PM
I appreciate your badass wise sentences, but I like both art & science and wanted and want to understand art as a cognitive process as well, it's a part of the same machinery. Re-understanding operation is in progress, new understanding is not in vain, this won't take much; and one of my immediate next AGI tasks is to manage to think and write about cognition in your terms - will teach your stuff and your comments to my students on Friday.

BTW, I believe a little bit of promotion may help even such a detached person like you. You agree that collaboration is the best "cognitive accelerator" and I'm sure at least some of the famous and smart AGI people such as Schmidhuber would spend some time with your articles and may make others consider them.

All needed is to let him/them know about you somehow.

Posted by Todor Arnaudov, last edited May 22, 2010 2:33 PM
> I appreciate your badass wise sentences, but I like both art & science and wanted and want to understand art as a cognitive process as well, it's a part of the same machinery. Re-understanding operation is in progress, new understanding is not in vain,

Generalization is a reduction. Yes, everything you know is related to it, but you won't get anywhere by piling things up.

> one of my immediate next AGI tasks is to manage to think and write about cognition in your terms - will teach your stuff and your comments to my students on Friday.

Holding my breath :)

> BTW, I believe a little bit of promotion may help even such a detached person like you. You agree that collaboration is the best "cognitive accelerator" and I'm sure at least some of the famous and smart AGI people such as Schmidhuber would spend some time with your articles and may make others consider them.

I appreciate your appreciation (& promotion), but you forgot the second best accelerator. The reason I am, IM!HO, a lightyear ahead of anyone else is that I gave up on recognition & collaboration with tinkerers+fluffers that populate the field. It's not what you got, it's how you use it. Smarts won't do any good if you lack motivation to focus on the only problem that matters. Famous people have their blinders on. They're too distracted by, & protective of, their fame to pay attention to some security bum who tells them that their lifework is a pile of irrelevant crap.
Yes, collaboration would be great, but... I despair. Anyone who knows how to punch right keywords into Google will find me (there's *nothing* else), & those who don't are likely to be more trouble than help.

Posted by Boris Kazachenko, last edited May 22, 2010 8:06 PM

How to filter out the improbable seems to me to be the key

Generation of a plethora of possible near-futures seems possible, but how to filter out the staggering majority which are improbable, or illegal in terms of the physical laws of the universe, seems complicated. Also, how to collapse possibilities that are so similar as to be essentially the same probabilistically? Then, your discussion of probability ranking the remaining possibilities makes sense.

In any case, it would be a delight to hear from you Vitya/Burya. rick at bunkerplanet dot com.


Last edited Jul 17, 2009 5:12 AM
This is a bit backwards, Rick, I propose to *discover* possibilities (patterns), not to generate them. "Generative" bias is typical for a programmer :). The patterns are formed by comparing lower-level inputs, & projected to the extent of cumulative match discovered by such comparison. This is how physical laws are discovered, & it also answers your second question. This knol has a more detailed discussion of the process, but I guess it's unbearably abstract. Intelligence is a subject everyone feels competent to discuss, because everyone has it. Yet, no one can reduce it to a formal procedure, or even formally define its purpose. I feel such reduction requires an extreme "inductive" cognitive bias, the opposite of the "deductive" bias selected for & cultivated by Computer Science & Math.

AGI

Interesting. Very nice to see more people working on Artificial General Intelligence.

I have written a few articles on self improving AI here: http://seedai.blogspot.com/2007_08_01_archive.html
In those, I agree with much of what is written here, for example "If we want to talk about improving programs, we have to define what it means to improve one's intelligence, and thus what it means to be intelligent. We want intelligent systems to be useful. Useful intelligence is, just as science, about prediction, planning and pattern recognition. These are all so intertwined as to be more or less the same thing."

You are very welcome to read and post your thoughts on my articles.


Last edited Aug 11, 2008 10:28 PM
Thanks David!
You're right, it sounds very similar on a high level, & I am sure there are many people who'd agree with the definition. But I don't know of anyone who used it to derive a universal, low-level, quantitative criterion to select inputs & algorithms. The key is to start from the beginning: raw sensory inputs, & "test" their predictive value, in the process discovering more & more complex patterns. That's what scalability is all about, if you can't evaluate pixels, it'll be super-exponentially more difficult to start from more complex data. That's why I think Cyc, NLP, & high-level approaches in general are hopeless for AGI.
I am sorry, but your "Intelligence test" idea, besides being entirely hypothetical & presumably externally administered, has it exactly backwards. Just like many Algorithmic Learning approaches, you want to generate patterns & algorithms, instead of discovering them in the real world. Quite simply, we predict from experience, these patterns & algorithms will have *no* predictive value beyond mere chance, unless they're derived from the environment. Notice that the difference between patterns & algorithms is strictly in the origin: the former are discovered & the latter are "invented".


Francesco Lentini:

How about semantics?

Interesting article. Have you seen my "The machine to read"?

Last edited Jan 23, 2011 5:39 PM
Thanks Francesco!
Semantics (meaning) must be learned from experience, starting from sensory inputs. What I suggest is a conditionally iterative learning algorithm, & syntax here is simply a record of operations performed by this algorithm on a given set of inputs. Such a record is necessary to maintain comparability (readability) across inputs of various "depth" of processing. This processing is a form of compression, & recorded syntax makes it possible to decompress data.
Thanks for the pointer, I'll take a look.

Posted by Boris Kazachenko, last edited Jul 27, 2008 6:40 PM
I agree with your thesis. Well, general intelligence must be scalable, or self-improving. Nevertheless, I am not sure that meaning *must* be learned from experience. Meaning (or a certain level of meaning) would be an intrinsic property of a message, and my algorithm Semantic Browsing http://www.intellibook.net/semanticbrowsing would show really this. I collected a lot of examples (browsed texts) on my site.
Returning now to your general intelligence definition, the focal point is the criterion of improvement. Can you explain better what this criterion should be, and/or can you furnish a practical example?

Posted by Francesco Lentini, last edited Jul 27, 2008 12:45 PM
The meaning "must" be learned, either by the algorithm, or by programmer's own "learning algorithm". I am sure the latter is common among people & some of it is incorporated into natural language syntax, thus becoming "an intrinsic property of a message". Other than that, you can try to build a universal ontological database (as in Cyc) & use it to locate the "meaning" of individual terms & phrases. A lot of people work on "semantic search", "semantic web", NLP in general, but this is not my focus & I am ill-equipped to evaluate your algorithm.
Appreciate your interest in my "focal point". The criterion for intelligence is *predictive correspondence concentration*, or relative cumulative match of expectations to the following inputs. I've defined match on the lowest, single-variable, level. It's the same on higher levels, where inputs are multi-variable sequences. As long as you synchronize the syntax of the comparand sequences, the total match is the sum of corresponding variables' matches between the sequences. I suppose you're looking for NL-level examples, & that's where it gets extremely ambiguous. That sort of data went through a huge number of process iterations, & you have to rely on intuition to track it.
Take a look at "On Intelligence" by Jeff Hawkins, he is a lot better at high-level examples than I am.
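A minimal sketch of the summation described above, not the knol's actual procedure. The single-variable `match` is assumed here to be the smaller of the two comparands; that definition is not restated in this comment, so treat it as an assumption. "Synchronized syntax" is reduced to the simplest case of two equal-length integer sequences, and both function names are invented for illustration.

```python
def match(a: int, b: int) -> int:
    """Single-variable match. Assumption: the smaller of the two comparands."""
    return min(a, b)


def total_match(expected, actual):
    """Sum of per-variable matches between two synchronized sequences."""
    if len(expected) != len(actual):
        # Stand-in for "synchronize the syntax": positions must correspond.
        raise ValueError("sequences must have corresponding variables")
    return sum(match(e, a) for e, a in zip(expected, actual))


# Expectations compared to the inputs that followed them.
print(total_match([3, 5, 2], [4, 1, 2]))
```

With these toy values the per-variable matches are 3, 1 and 2, so the total is 6.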

Posted by Boris Kazachenko, last edited Jul 27, 2008 7:33 PM
I notice that http://www.intellibook.net/intellibook10/ is not working anymore, it would have been Lentini's article. I have argued elsewhere (and in vain) that any algorithm would have to be "seeded" with real world "statistics", particularly something like vision has been shown to be heavily informed about useful and usual colors and shapes, while it should not be necessary to reproduce human handicaps like the difficulty of reading mirrored text.

What I think is less understood is how "thinking" will also need its own set of "built ins", patterns and concepts and processes that would be unfair to expect an AGI to work out bit by bit. I am working on isolating these built ins, and would also like to offer a counterexample on the limits of reverse engineering input bits: imagine someone sends you the digits from pi's decimal expansion, and just to trick you out starts from an arbitrary position, let's say from the 100th onwards. It would be "intelligence" to come up with this explanation and predict the sequence ad infinitum, but what kind of IQ is required? I'd say infinite, the problem is intractable and would suggest that there is no intelligence at all "in general". Intelligence is a response to a constrained environment, it is about straight lines and circles and a few "primary colors" and the tendency of things to change at manageable rates and people having limited emotional states etc. Working with bits supposedly coming from an unconstrained/unknown environment is a recipe for failure methinks.

Posted by Anastasios Tsiolakidis, last edited Jan 21, 2011 3:32 PM
Right, any intelligence would be useless in an effectively random environment. But our real environment is plenty constrained already, first by entropy growth, then by evolution, now by technology. Constraining it even further is a piece of cake, - all you have to do is slow down time. I think you’re looking for easy problems because you can’t deal with the hard one, - scalable pattern discovery in an environment that our own intelligence handles easily.

Posted by Boris Kazachenko, last edited Jan 21, 2011 5:12 PM
Let's just say that I favor problems where "environmental statistics" are plenty or even complete (in toy problems). In addition to natural language I would single out these two problem domains: 1) language development between two agents, i.e. using a communication channel between 2 protoAGIs to cooperate, or more accurately to have AGI.b do "what AGI.a says, not what it does". This also implies an independent "observation channel". What does it take for AGI.b to turn left when receiving the message "l", starting from blank slates, tabula rasa? Unintentionally you may have received insight into one of zoology's sad stories, why intelligent animals are so bloody hierarchical!

2) the "embedded scientist", getting a protoAGI to predict/reverse engineer its environment while fully exposed to it, thus working around the problem "the observer changes the observation" and perhaps having to "fight for its life" as well. This needs a simulation of a different kind than your average game engine, probably a cellular automaton implementation. The real shoulders of giants for human intelligence are not so much Euclid and Einstein but the biological heritage which enables us to stay alive long enough as individuals and civilizations to slowly unravel the mystery of the world, it would be a miracle if an embedded intelligence in the Game of Life achieved that state where it can just wait and formulate algorithms. A big spanner in the works remains the "unsolved" problem of society and synergy, we have found a way it seems to benefit from millions of people who are mutually clueless, meaning they have different areas of expertise. Obviously we are not perfect at this, we may have failed to fully integrate the genius of, say, Tesla and Jesus, but we are better than any program I have seen. (on a tangent, I should add that it is anything but self-evident that we are benefiting from our synergy in any deeper sense, I simply refer to the build up of science and technology)

Posted by Anastasios Tsiolakidis, last edited Jan 22, 2011 7:30 AM
Anastasios,
thanks for reporting, the service has been restored! Go to www.intellibook.net and click "the machine to read".
Well, at the moment this is my response to your clever chatter. Please enter in the box a text written in ANY language (Latin alphabet) from 1500 to 15000 chars in length, click a button and see what happens. For example, here is a RESUMEE of "Executive Attention" article by Boris.

Attention is a mechanism that focuses cognitive search.

Attention span as discussed here is not a simple duration of focus on a subject.

Rather, it’s a scope of cognitive search (level of generalized experience) that determines priorities, - selects subjects for focused ATTENTION.

Deliberate control over the focus of one's ATTENTION will be the most profound revolution yet, - it will change what we want out of life.


Precisely, this RESUMEE is based on the first 8K of the article, because you, as Guest, may not exceed this limit. Let me know if you want a registered (user payable) account.
Hi Boris, can you do more and better?

Posted by Francesco Lentini, last edited Jan 23, 2011 9:09 AM
Thanks Francesco!

Not a bad summary, but it missed the meat of the knol, which is in the “Practical Implications” part. The problem with your approach is that a summary should be an introduction to an article, & a good author would write his own (the knol starts with one). If your algorithm can do better than the author, then it should be writing its own articles :).
