6/6/14

Annotation, with Dan He


Annotation for part 3 of the core article:

1st Level:

is:   initial single-variable inputs, such as pixels
ds:  differences between consecutive inputs
ms: partial matches between consecutive inputs
L:    same-sign length, initially same-d-sign Ld

dP: difference pattern, defined by Ld & containing:

Ld: same-d-sign length
i:     last input
I:    sum of inputs
D:   sum of differences between consecutive inputs
M:  sum of partial matches between consecutive inputs
Q(d): queue of differences ( Q(m) is not recorded )
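
For concreteness, a minimal Python sketch of this 1st-level scan; the dict-based pattern record, the function name, & the example row are only illustrative (zero differences are lumped with negative sign here):

def form_dPs(row):
    dPs, dP = [], None
    for prior, i in zip(row, row[1:]):        # consecutive input pairs
        d = i - prior                         # difference
        m = min(i, prior)                     # partial match = smaller of the two inputs
        sign = d > 0                          # same-d-sign spans define dPs
        if dP is None or sign != dP['sign']:  # sign change: close current dP, start a new one
            if dP: dPs.append(dP)
            dP = dict(sign=sign, Ld=0, i=i, I=0, D=0, M=0, Qd=[])
        dP['Ld'] += 1                         # same-d-sign length
        dP['i'] = i                           # last input
        dP['I'] += i                          # sum of inputs
        dP['D'] += d                          # sum of differences
        dP['M'] += m                          # sum of partial matches
        dP['Qd'].append(d)                    # queue of differences ( Q(m) is not recorded )
    if dP: dPs.append(dP)
    return dPs

dPs = form_dPs([1, 9, 2, 8, 3, 7, 1, 2, 3, 7, 8, 5, 0, 1])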

feedback to the 1st level that determines 2nd derivation:

aD: average (D-last_d) in higher-level patterns that also contain mean value of Md
(total match is a subset of total comparands, so D-last_d is a crude predictor of Md)

vD = (D-last_d) - aD: evaluation for 2nd derivation within dP
if vD is positive, consecutive ds within Q(d) are compared, forming:

Q(ddP): queue of second derivation patterns:
ddP is defined by Ldd & contains:

Ldd: same-dd-sign length, within which:
d:      last difference
D:     sum of consecutive ds
Dd:   sum of dds: differences between consecutive differences
Md:  sum of mds: matches between consecutive differences
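
A hedged sketch of this 2nd derivation, reusing the dP record from the sketch above; aD is assumed to arrive as higher-level feedback, & md is kept positive by comparing magnitudes of the same-sign ds:

def second_derivation(dP, aD):
    vD = (dP['D'] - dP['Qd'][-1]) - aD            # evaluation for 2nd derivation within dP
    if vD <= 0:
        return None                               # ds are not compared
    ddPs, ddP = [], None
    for prior, d in zip(dP['Qd'], dP['Qd'][1:]):  # consecutive ds within Q(d)
        dd = d - prior                            # difference between differences
        md = min(abs(d), abs(prior))              # match between same-sign differences
        sign = dd > 0
        if ddP is None or sign != ddP['sign']:    # same-dd-sign spans define ddPs
            if ddP: ddPs.append(ddP)
            ddP = dict(sign=sign, Ldd=0, d=d, D=0, Dd=0, Md=0)
        ddP['Ldd'] += 1                           # same-dd-sign length
        ddP['d'] = d                              # last difference
        ddP['D'] += d                             # sum of consecutive ds
        ddP['Dd'] += dd                           # sum of dds
        ddP['Md'] += md                           # sum of mds
    if ddP: ddPs.append(ddP)
    return ddPs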

possibly present feedback to determine 3rd derivation:
aDd: average (Dd-last_dd) in higher-level patterns that also contain mean value of Mdd

if this feedback is present, the second derivation pattern also contains a queue of discrete dds:
Q(dd), which is evaluated for 3rd derivation in the same way as Q(d) for 2nd derivation above

d_ddP: combined 1st & 2nd derivation pattern, defined by dP & containing:

dP(), 2D = D+Dds, 2M = M+Mds,
Q(ddP), or Q(cddP): complemented ddPs, analogous to cdPs:

cdP: complemented pattern = positive dP + negative dP, containing:

rL:    positive dL / negative dL
cL:    complemented L = positive dL + negative dL
p2D: positive 2D projected over negative dL = D * rL + sum_Dds * rL
cD:    positive p2D + negative 2D
cM:   positive dP_M + negative dP_M + cD * amd
+dP(),-dP().
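
A literal transcription of the cdP formulas above, as a hedged Python sketch; amd is assumed to arrive as feedback, Dds defaults to 0 when a dP has no 2nd derivation, & the field names are illustrative:

def form_cdP(pos_dP, neg_dP, amd):
    rL  = pos_dP['Ld'] / neg_dP['Ld']            # positive dL / negative dL
    cL  = pos_dP['Ld'] + neg_dP['Ld']            # complemented L
    pos_2D = pos_dP['D'] + pos_dP.get('Dds', 0)  # 2D = D + Dds
    neg_2D = neg_dP['D'] + neg_dP.get('Dds', 0)
    p2D = pos_2D * rL                            # = D * rL + sum_Dds * rL
    cD  = p2D + neg_2D                           # positive p2D + negative 2D
    cM  = pos_dP['M'] + neg_dP['M'] + cD * amd   # positive dP_M + negative dP_M + cD * amd
    return dict(rL=rL, cL=cL, cD=cD, cM=cM, pos_dP=pos_dP, neg_dP=neg_dP)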


2nd Level:

initial inputs are cdPs,

feedback to 2nd level for evaluating cdP inclusion into positive or negative vP is
acM: average cM in higher-level patterns that also contain mean value of m_cdP

(basic assumption is that total lower-level match cM predicts higher-level match
m_cdP: combined match between same-type variables of consecutive cdP pair)

cdP evaluation for inclusion into vP: v_cdP = cM - acM;
consecutive cdPs with same v_cdP sign are included in corresponding
vPs: positive or negative value pattern of cdPs, defined by L_cdP & containing:

L_cdP:  number of cdPs within same-v_cdP-sign span
L_vP:    sum of cLs from included cdPs
D_vP:    sum of cDs from included cdPs
M_vP:   sum of cMs from included cdPs
Q(cdP): queue of cdPs
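
A minimal sketch of this inclusion step; acM is assumed to be higher-level feedback, & the cdPs are records like those returned by form_cdP above:

def form_vPs(cdPs, acM):
    vPs, vP = [], None
    for cdP in cdPs:
        sign = (cdP['cM'] - acM) > 0              # sign of v_cdP = cM - acM
        if vP is None or sign != vP['sign']:      # same-v_cdP-sign spans define vPs
            if vP: vPs.append(vP)
            vP = dict(sign=sign, L_cdP=0, L_vP=0, D_vP=0, M_vP=0, Q_cdP=[])
        vP['L_cdP'] += 1                          # number of included cdPs
        vP['L_vP']  += cdP['cL']                  # sum of cLs
        vP['D_vP']  += cdP['cD']                  # sum of cDs
        vP['M_vP']  += cdP['cM']                  # sum of cMs
        vP['Q_cdP'].append(cdP)                   # queue of cdPs
    if vP: vPs.append(vP)
    return vPs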

feedback to 2nd level for evaluating vP for comparison within Q(cdP):
aM_vP: average M_vP in higher-level patterns that also contain mean value of m_cdP

evaluation for comparison within Q(cdP):
V_cdP = M_vP - aM_vP,

if V_cdP is positive, consecutive cdPs within Q(cdP) are compared, forming VP & a
queue of dP_cdPs: same-sign Difference patterns among cdPs,
where Difference is a sum of differences from all variables of cdP:

VP: vP containing cross-compared cdPs, resulting dP_cdPs, & new variables:

L_dP_cdP: number of dP_cdPs
D_cdP:    sum of d_cdPs, where d is a sum of individual-variable ds
M_cdP:   sum of m_cdPs, where m is a sum of individual-variable ms
Q(dP_cdP):  queue of same-sign combined-difference patterns among cdPs
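
A hedged sketch of this cross-comparison; aM_vP is assumed to be feedback, & only a small illustrative subset of cdP variables is compared here:

def compare_cdPs(vP, aM_vP, keys=('cL', 'cD', 'cM')):
    if vP['M_vP'] - aM_vP <= 0:                   # V_cdP: evaluation for comparison within Q(cdP)
        return vP                                 # negative: cdPs are not cross-compared
    D_cdP = M_cdP = 0
    dP_cdPs, cur = [], None
    for prior, nxt in zip(vP['Q_cdP'], vP['Q_cdP'][1:]):
        d = sum(nxt[k] - prior[k] for k in keys)      # combined difference: sum of per-variable ds
        m = sum(min(nxt[k], prior[k]) for k in keys)  # combined match: sum of per-variable ms
        D_cdP += d
        M_cdP += m
        sign = d > 0
        if cur is None or sign != cur['sign']:    # same-sign combined differences define dP_cdPs
            if cur: dP_cdPs.append(cur)
            cur = dict(sign=sign, L=0, D=0, M=0)
        cur['L'] += 1
        cur['D'] += d
        cur['M'] += m
    if cur: dP_cdPs.append(cur)
    vP.update(L_dP_cdP=len(dP_cdPs), D_cdP=D_cdP, M_cdP=M_cdP, Q_dP_cdP=dP_cdPs)
    return vP                                     # vP is now a VP in the terms above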

Similar to the way it’s done on the 1st level, these combined-difference patterns are then evaluated for internal comparison with incremental derivation, & then combined into complemented-D-sign cdP_cdPs.



To be continued.

27 comments:

Boris Kazachenko said...

Dan H said:

It's definitely better, but still far from clear. I think it's better to use '--' instead of '-' for the
so-called "common convention in English": it should be a long dash instead of a short one. But in
your blog, it's the same short dash as "minus", which is sometimes very confusing.
I think it's better to show a table for all these annotations, as I found it hard to refer to their
definitions in different places. So here I am trying to initiate this table:

D: The sum of the differences
M: The sum of the matches
ds: The differences
ms: The matches
Q(d): queues of signed differences
Q(m): queues of unsigned matches
dP: difference pattern on level 1
L: same-d-sign length
Md: total match between consecutive differences
pMd: projected Md
amd: average or mean match per difference
rM: M/I the ratio of total matches over the inputs
pVd: relative value of pMd
aMd: Md that co-occurs with mean additive match on higher level
Q(ddP): queue of second derivation patterns
ddP: second derivation patterns
Ld: same-dd-sign length
dds: The differences of differences
mds: The matches of differences
Dd: The sum of dds
Md: The sum of mds
Q(dd): discrete dds
cdP:
rM:
aM: mean current-level match that coincides with mean higher-level match
from evaluated operation
rMP: span of same-sign relative match
Q(cdP)
D_cdP:
M_cdP:
cdP_DPs:


To be honest, I've never read any annotations that are so confusing. There is no way for me to
even understand them without a running example. I think I need to simulate an input integer string
and then compute the values of each variable above, so that I can understand their exact
definitions and the whole process.

I am not sure even for ms. I don't see any matches between adjacent integers
in the input?

Input: 19283712378501

D: The sum of the differences
M: The sum of the matches
ds: The differences 8 -7 6 -5 4 -6 1 1 4 1 -3 -5 1
ms: The matches
Q(d): queues of signed differences
Q(m): queues of unsigned matches
dP: difference pattern on level 1
L: same-d-sign length
Md: total match between consecutive differences
pMd: projected Md
amd: average or mean match per difference
rM: M/I the ratio of total matches over the inputs
pVd: relative value of pMd
aMd: Md that co-occurs with mean additive match on higher level
Q(ddP): queue of second derivation patterns
ddP: second derivation patterns
Ld: same-dd-sign length
dds: The differences of differences
mds: The matches of differences
Dd: The sum of dds
Md: The sum of mds
Q(dd): discrete dds

Dan

Boris Kazachenko said...

Thanks Dan!

> I think it's better to use '--' instead of '-' for the

How about I use ~ or = in relatively technical parts?

> I think it's better to show a table for all these annotations,

Great, I just published a new post:
“Collaboration with Dan He”, part 1:
Annotation for part 3 of the intro:

is: initial single-variable inputs, such as pixels
ds: differences between consecutive inputs
ms: partial matches between consecutive inputs
L: same-sign length, initially same-d-sign Ld

dP: difference pattern defined by Ld & containing:

Ld: same-d-sign length,
i: last input
I: sum of inputs,
D: sum of differences between consecutive inputs,
M: sum of partial matches between consecutive inputs,
rM: ratio of M/I
Q(d): queue of differences ( Q(m) is not recorded )

Feedback to the 1st level to determine 2nd derivation:

amd: average | mean match per magnitude of difference
aMd: Md that co-occurs with mean additive match on a higher level

Evaluation for second derivation within dP:

pMd: projected Md: (D-last_d) * amd * rM
pVd: value of projected match: pMd - aMd

Q(ddP): queue of second derivation patterns, formed by comparing
consecutive ds within Q(d) of dP with positive pVd
ddP: second derivation pattern defined by Ldd & containing:

Ldd: same-dd-sign length
d: last difference
D: sum of differences
Dd: sum of dds: differences between consecutive differences
Md: sum of mds: matches between consecutive differences

Conditional on presence of amdd & aMdd
(feedback to determine 3rd derivation):

rMd: ratio of Md/D
Q(dd): queue of dds

dP_ddPQ: combined 1st & 2nd derivation pattern,
defined by dP & containing:
dP(), 2D=D+Dds, 2M=M+Mds, Q(ddP)

cdP: combined positive dP & negative dP complemented pattern:

D_cdP: positive dP_D + negative dP_D
M_cdP: positive dP_M + negative dP_M

To be continued.

> To be honest, I've never read any annotations that are so confusing.

That’s what happens when you work alone: lots of things are implicit. There is an advantage to that too; brevity helps to keep the big picture in mind. But yeah, my notation still needs work.

> No way for me to even understand them without a running example.

I think what matters is understanding *why* I do these operations.

> I am not sure even for ms.
> I don't see any matches between adjacent integers in the input:
> 19283712378501

1223311237500: partial match is the smaller of two inputs, see part 2.
These are absolute matches, selection is by relative projected match.
Also, your string is random, real sensory inputs (pixels) have far better
similarity / proximity correlation: average match will be higher.
That’s why image compressing transforms work.
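
A minimal Python check of these partial matches, with the string written out as a list of digits:

s = [1, 9, 2, 8, 3, 7, 1, 2, 3, 7, 8, 5, 0, 1]
ms = [min(a, b) for a, b in zip(s, s[1:])]   # -> [1, 2, 2, 3, 3, 1, 1, 2, 3, 7, 5, 0, 0]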

Thanks again Dan, I owe you another $200 :).

Dan He said...

For input 1 9 2 8 3 7 1 2 3 7 8 5 0 1

is: initial single-variable inputs, such as pixels 1 9 2 8 3 7 1 2 3 7 8 5 0 1
ds: differences between consecutive inputs : 8 -7 6 -5 4 -6 1 1 4 1 -3 -5 1
ms: partial matches between consecutive inputs: 1 2 2 3 3 1 1 2 3 7 5 0 0
D: sum of ds = 0 (Wow, by random chance?)
M: sum of ms = 25
Ld: same-d-sign length: not sure about this
L: same-sign length, initially same-d-sign Ld: not sure about this
Q(d): queue of differences: the same as ds? 8 -7 6 -5 4 -6 1 1 4 1 -3 -5 1
i: last input = 1
I: sum of inputs = 57
rM: ratio of M/I = 25/57 = 0.439

amd: average | mean match per magnitude of difference: not sure about this
aMd: Md that co-occurs with mean additive match on a higher level: not sure about this
pMd: projected Md: (D-last_d) * amd * rM
pVd: value of projected match: pMd - aMd

What's the point of defining the above 4 variables?

Boris Kazachenko said...

Thanks Dan!

As I said, your string is not an "interesting" input, so it won't justify deep comparison. L is a class, not a specific variable.
D & M are only computed within same-d-sign length dL, which in your example is always 1, except for two stretches:
12378: +dL = 4, D=7, M=7, Q(d): 1 1 4 1; &
850: -dL=2, D=8, M=5, Q(d): -3, -5;

Q(d) is recorded in case there is a need for second derivation: positive pVd, which depends on pMd, which depends on the feedback variables aMd & amd.
The last two are derived from a wider range of past inputs, represented on a higher level. So, I can't compute them from your string; you need a wider context for that, but it looks like neither of these two "difference patterns" will justify second derivation.

Also, my definition of pMd was wrong, it should be:
pMd = (((D-last_d) * amd) + ((D-last_d) * rM)) / 2.
That's an average computed from two different methods to project Md.
I'll post an update soon.

Regards,
Boris

Dan He said...

For two stretches:
12378: +dL = 4, D=7, M=7, Q(d): 1 1 4 1; &
850: -dL=2, D=8, M=5, Q(d): -3, -5;

for stretch 12378
I could see D = 1 + 1 + 4 + 1, but how is M computed?

for stretch 850
D and M are both positive? D = abs(-3 -5) = 8?
how is M computed?

Boris Kazachenko said...

Dan,

M = sum of ms, m = min(a,b), which is complementary to the co-derived d.
So, min(1,2)+min(2,3)+min(3,7)+min(7,8) = 13
Sorry, my mistake in the last reply, it's not 7.

> D = abs(-3 -5) = 8?

D is not signed because the negative sign is already indicated by -dL.

Dan He said...

But I still don't know how to compute amd and aMd.

For amd, what does "mean match per magnitude of difference" mean?

For aMd, what does "Md that co-occurs with mean additive match" mean? Md is a value, which is the sum of the matches between consecutive differences. So how could a value "co-occur" with a match? And what does "mean additive match" mean?

I know the following 4 variables are for the second-level derivation, but why do you define them this way?

amd: average | mean match per magnitude of difference: not sure about this
aMd: Md that co-occurs with mean additive match on a higher level: not sure about this
pMd: projected Md: (D-last_d) * amd * rM
pVd: value of projected match: pMd - aMd


Boris Kazachenko said...

Dan,

> For amd, what does "mean match per magnitude of difference" mean?

On a higher level, both Md & D are summed over multiple input spans into, say, hLe_Md & hLe_D. Then amd = hLe_Md / hLe_D.
As I said, magnitude of comparands crudely predicts their match because the latter is a subset of the former. So, amd * D is one way to predict Md.
rM * D is another way to predict Md, assuming that rM is the same for I & D.
So, I combine these two ways to predict Md: pMd = (((D-last_d) * amd) + ((D-last_d) * rM)) / 2

These two ways are not equal, but I haven't figured out the exact proportion yet.

pMd = (D-last_d) * amd * rM was wrong, I corrected it.
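
A small numeric illustration of the two projections & their average, with made-up values for amd & rM (both would come from higher-level feedback):

# hypothetical values: D & last_d from a dP, amd & rM as feedback
D, last_d = 7, 1
amd, rM = 0.6, 0.44
pMd_amd = (D - last_d) * amd       # projection via mean match per difference magnitude
pMd_rM  = (D - last_d) * rM        # projection via the input match ratio M/I
pMd = (pMd_amd + pMd_rM) / 2       # = 3.12, the combined projection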

> For aMd, what does "Md that co-occurs with mean additive match" mean?

Md is a specific type of match on the 1st level.
Mean additive match on a higher level: mam_hLe =
(sum of all matches formed on that level, not carried on from lower levels)
/ (sum of all inputs to that level)

aMd = value of Md in a higher-level input that also has mam_hLe,
so it's a predictor of mam_hLe. That is another principle of prediction: lower-level match predicts higher-level match. If it didn't, the inputs would be random, & there would be no point in pattern discovery.

So, I compare ds if their projected match, pMd, is at least high enough to predict mam_hLe. That's a more precise way of expressing "above average".

Dan He said...

Given ds: differences between consecutive inputs : 8 -7 6 -5 4 -6 1 1 4 1 -3 -5 1

mds: The matches of differences: -7, -7, -5, -5, -6, -6, 1, 1, 1, -3, -5, -5

Mds: The sum of mds = -46

So I am confused about your statement "Md is a specific type of match on the 1st level". Md is a value. What do you mean it's a specific type of match?

What is "mean additive match on a higher level: mam_hLe, =
(sum of all matches formed on that level, not carried on from lower levels)
/ (sum of all inputs to that level)" in this case?

Also for "aMd = value of Md in a higher-level input that also has mam_hLe", what do you mean by a value of Md (in this case, -46) that also has mam_hLe? How could a value also have another value?

And you are saying hLe_Md & hLe_D are not computable in this example, as they are on a higher level?

Also you said there are two ways to predict Md: amd*D and rM*D. I would like to compute these two values and see how different they are.

Finally, you mentioned this formula "pMd = (((D-last_d) * amd) + ((D-last_d) * rM)) / 2" is a prediction of Md. But we are able to compute Mds directly, right? Why do we need to predict it? And how do we measure the accuracy of the prediction if we don't know Md?

Boris Kazachenko said...

Dan,

> mds: The matches of differences: -7, -7, -5, -5, -6, -6, 1, 1, 1, -3, -5, -5
> Mds: The sum of mds = -46

mds are computed for same-sign differences only, & are always positive.
Different signs already tell you that the differences won't match; there's no point in comparing.

> Md is value. What do you mean it's a specific type of match?

It's a variable that represents total match of ds. M is a variable that represents total match of is.
And so on. These are different types of match, all represented & compared on higher levels.

> What is "mean additive match on a higher level: mam_hLe, =
> (sum of all matches formed on that level, not carried on from lower levels)
> / (sum of all inputs to that level)" in this case?

Higher-level inputs contain multiple variables, including those that represent match produced on lower levels. All these variables are compared, which gives you *additive* match.

> "aMd = value of Md in a higher-level input that also has mam_hLe,",

A pattern that produced mam_hLe (a specific new variable) also contains Md: another variable. So, these co-occurring Mds are summed on the next level to form, say, Md_hhLe, which is then divided by the number of patterns to form an average: aMd.

> And you are saying hLe_Md & hLe_D are not computable in this example as they are in higher level?

Correct.

> Also you said there are two ways to predict Md: amd*D and rM*D. I would like to compute these two values and see how different they are.

Your computation will be case-specific; you can't generalize from it. The difference should be deduced a priori.

> Finally you mentioned this formula "pMd = (((D-last_d) * amd) + ((D-last_d) * rM)) / 2" is a prediction of Md.
> But we are able to compute Mds directly right? Why we need to predict it?

Any comparison beyond original inputs must be selective, otherwise you get combinatorial explosion.
But this prediction does seem too complex for Md; I'll try to simplify it.

> And how do we measure the accuracy of the prediction if we don't know Md?

Md doesn't come from outer space, it is related to all other variables by the derivation process.
I'll get back to you on that later.

Boris Kazachenko said...

Dan,

So, I now think that:
Md = (((D-last_d) * amd) + ((D-last_d) * M/I )) / 2 is too complex for the 1st level, &
m_cdP = (mag * am_cdP + cM * amm_cdP) / 2 is too complex for the 2nd level.

Cruder predictors should suffice; I just posted an updated annotation table:

…feedback to the 1st level that determines 2nd derivation:

aD: average (D-last_d) in higher-level patterns that also contain mean value of Md
(total match is a subset of total comparands, so D-last_d is a crude predictor of Md)

vD = (D-last_d) - aD: evaluation for 2nd derivation within dP
if vD is positive, consecutive ds within Q(d) are compared…

…feedback to 2nd level for evaluating cdP inclusion into positive or negative vP is
acM: average cM in higher-level patterns that also contain mean value of m_cdP

(basic assumption is that total lower-level match cM predicts higher-level match
m_cdP: combined match between same-type variables of consecutive cdP pair)

cdP evaluation for inclusion into vP: v_cdP = cM - acM
consecutive cdPs with same v_cdP sign are included in corresponding
vPs: positive or negative value pattern of cdPs, defined by L_cdP & containing…

> And how do we measure the accuracy of the prediction if we don't know Md?

We do, from the cases where its projected value is positive & ds are compared. We don't know it for the cases where the projection is negative, but we can extrapolate the ratio between projected Md & actual Md that we get from the positive cases.

Dan He said...

By "A pattern that produced mam_hLe (a specific new variable) also contains Md", you mean the pattern also produces Md? Words like "contains" and "has" confused me, making me think Md is something like a sub-pattern. And by "co-occurring Mds", you mean multiple Mds from the same input?

"Higher-level inputs contain multiple variables, including those that represent match produces on lower levels." So "mean additive match on a higher level: mam_hLe" relies only on the inputs on the higher level directly, not the matches produced on lower levels, right? Then what are the possible inputs on the higher levels besides the matches produced from lower levels? For the level 0, we have the raw inputs as pixels from the image, but what about for higher levels?


"Any comparison beyond original inputs must be selective, otherwise you get combinatorial explosion. " I agree, but how many predictions you are conducting? And you are using the prediction to prune further predictions? If you want to avoid combinatorial explosion, you need to prune the search space. I didn't see this part clearly.


Boris Kazachenko said...

> By "A pattern that produced mam_hLe (a specific new variable) also contains Md", you mean the pattern also produces Md?

Dan, a higher-level input is a lower-level pattern that consists of multiple variables. See my definitions of dP, ddP, cdP, vP & so on. Their parameters are variables, & each is compared on the next level, generating new variables. So, a higher-level input contains a record of Md produced on a lower level, among all other parameters.
I know you have other things on your mind, but we've covered this several times already.

> The words like "contains", "has" confused me, making me think Md is something like a sub-pattern.

It's a variable. A pattern is a set of matching inputs, but its representation on a higher level is a single input containing several variables. That's compression.

> And by "co-occurring Mds", you mean multiple Mds from the same input?

No, a mean is an average of multiple inputs. You only have one variable of a given type per input.
Higher-level patterns contain both Md & mam_hLe, among other variables.
So, aD is an average of Ds from hLe patterns containing an average Md, and aMd is an average of Mds from hhLe patterns containing mam_hhLe. And so on, till you run out of levels.


>"Higher-level inputs contain multiple variables, including those that represent match produces on lower levels."
> So "mean additive match on a higher level: mam_hLe" relies only on the inputs on the higher level directly,

It's a sum of matches from all variables in an input: m_M + m_D + m_Ld + m_Md + m_Dd, & so on.

> not the matches produced on lower levels, right?

Right, but it includes match of variables representing matches produced on lower levels, see above.

> Then what are the possible inputs on the higher levels besides the matches produced from lower levels?
> For the level 0, we have the raw inputs as pixels from the image, but what about for higher levels?

Again, see my definitions of dP, ddP, cdP, vP...

> "Any comparison beyond original inputs must be selective, otherwise you get combinatorial explosion. "
> I agree, but how many predictions you are conducting?

It's evaluation; any input is already a prediction of adjacent inputs.
Cost-benefit analysis: value of evaluation = |net-negative value of avoided comparisons| - cost of evaluation.
This analysis itself can be indefinitely complex, but it must be significantly less so than the operations being evaluated.

The first selective comparison, between ds, actually has a non-selective partial comparison: AND(sign).
Matching sign defines a dP, which is evaluated for multiple d comparisons at once. So, you have one subtraction, (D-last_d) - aD, that determines multiple subtractions between ds. But this case is unusual because the rate of evaluation per comparison is not adjustable.

> And you are using the prediction to prune further predictions?

You prune inputs: feedforward. Evaluation criteria, such as aD, are input-driven feedback. But there is a higher-order evaluation on a higher level, where you have a representation of multiple spans of inputs, defined by the different values of the evaluation criteria that pruned them.
You always have a single evaluation for multiple operations at once; otherwise it's not cost-effective.
So, that aD is also evaluated, but on a higher level & at a lower frequency than ds.

But I am not sure how to determine composition of such evaluation criteria automatically.
Anyway, a good question. I owe you :).

> If you want to avoid combinatorial explosion, you need to prune the search space. I didn't see this part clearly.

Higher-level inputs are value patterns: spans of lower-level outputs. So, the number of inputs is reduced. The number of variables per input increases, but not as much. Lower-level inputs are only compared within selected positive value patterns: that's a two-step pruning. See the "annotation" for the 2nd level.

Dan He said...

Boris, as you mentioned, my string is not interesting. I just took a real string converted from an image; it's indeed a hex string. One question: is your method able to deal with all types of numeric strings, such as hex, base64, etc.?

The input string:
is: initial single-variable inputs, such as pixels: = 652131242424242361236123612361213124242456242424234434242461272424242451242132242132242424242436242424213077245621

As the string is getting longer, I wrote a Java program to compute all the values.

ds: differences between consecutive inputs := -1 -3 -1 2 -2 1 2 -2 2 -2 2 -2 2 -2 1 3 -5 1 1 3 -5 1 1 3 -5 1 1 3 -5 1 -1 2 -2 1 2 -2 2 -2 2 1 1 -4 2 -2 2 -2 2 -2 1 1 0 -1 1 -2 2 -2 2 2 -5 1 5 -5 2 -2 2 -2 2 -2 2 1 -4 1 2 -2 -1 2 -1 0 2 -2 -1 2 -1 0 2 -2 2 -2 2 -2 2 -2 2 -1 3 -4 2 -2 2 -2 2 -2 -1 2 -3 7 0 -5 2 1 1 -4 -1
ms: partial matches between consecutive inputs : = 5 2 1 1 1 1 2 2 2 2 2 2 2 2 2 3 1 1 2 3 1 1 2 3 1 1 2 3 1 1 1 1 1 1 2 2 2 2 2 4 5 2 2 2 2 2 2 2 2 3 4 3 3 2 2 2 2 4 1 1 2 2 2 2 2 2 2 2 2 4 1 1 2 2 1 1 2 2 2 2 1 1 2 2 2 2 2 2 2 2 2 2 2 3 3 2 2 2 2 2 2 2 1 1 0 0 7 2 2 4 5 2 1

In your opinion, is the input string "interesting"? If yes, I'll continue the computation for the other variables. If not, I'll change the string. But keep in mind that this string is from a real image.

Boris Kazachenko said...

Dan,

> Is your method able to deal with all types of numeric strings

Yes, any integers.

> In your opinion, is the input string "interesting"?

Well, “interesting” means strong patterns. As I said, selection for higher orders of comparison is made for individual patterns rather than a whole string. And the vast majority of inputs don’t form strong patterns. We forget almost all that we see, unless we see it many times, which then makes it an “infra” pattern.

In your example, I don’t see any strong difference patterns, so there won’t be any second derivatives. But there might be strong complemented-difference patterns: matches between cdPs. And even if there aren’t many of those, there may still be strong 2D patterns, & so on. In any case, this selection for higher-order comparison must be automatic; it shouldn’t depend on my "opinion".

Again, basic criterion for second derivation is aD: average (D-last_d) in higher-level patterns that also contain mean value of Md.
I am currently trying to determine when to introduce more advanced criteria, such as M: a sum of input matches. A combined D & M criterion would be something like:
average (D * rD + M * rM) in higher-level patterns that also contain mean value of Md, where rM = Md / M, derived by comparison between variables within a pattern (see part 6). It should become a criterion (by being sent as feedback) if it then matches across patterns & is thus predictive of Md. So, the feedback probably has to be from the level beyond the next one.

Higher derivation may seem like a minor issue, but the same principles should apply to other types of selective comparison.

Dan He said...

So are you able to show a real example where there are "interesting" patterns?

Boris Kazachenko said...

Dan, "interesting" is relative. I've told you a bunch of times that I select for stronger-than-average patterns. That average is a feedback from higher levels, which represent past experience. It all depends on what the system seen before.
This is the most theoretical problem ever, you can't go by examples.

Dan He said...

Yeah, I know. But without a real example, how can you even tell if the method is applicable to real-world problems?

So what kind of data sets does this method target? I guess, for example, a set of pictures where there are common patterns?

Boris Kazachenko said...

I only care about one problem: pattern discovery. The algorithm will be "applicable" if it fits the formal definition of "pattern".

Testing won't help to improve it: the results are open to all kinds of interpretation. It will have to be tested for speed, say relative to human vision, but I am not there yet. I still have some loose ends that can only be worked out theoretically.

Initial data set could be something as simple as cat videos, perhaps multiple confocal video streams.

Dan He said...

This is the problem. It's hard for me to fully understand your theories just from your explanation. And if we can't construct a workable example at this time, I guess I'll have to write some pseudocode then.

Boris Kazachenko said...

Sure, it may help. But I personally prefer to work on a higher level:
~ pseudo pseudo pseudo code.

Dan He said...

But the reason you write the method down is to have other ppl understand it. So it should go down to the details.

Boris Kazachenko said...

Well, one of the reasons is that it helps me to focus. And there are other ppl who prefer to start from a higher level. Whether it's easy or difficult depends on the background. My approach is not arbitrary. Anyone who starts from the same (I think unavoidable) principles should arrive at the same conclusions.

But yes, pseudocode is a good idea.

Dan He said...

I am going to add more code

input is the string of input numbers
D = 0;
M = 0;
dL = 0;
// compute ds, ms
for (int i = 0; i < input.size()-1; ++i) {
ds = input[i+1] - input[i];
ms = min(input[i+1], input[i]);

if ds is of the same sign as ds for i-1 {
D += abs(ds);
M += abs(ms);
if the sign is positive {
mark dL as positive
}
else {
mark dL as negative
}
}
else {
D = 0;
M = 0;
dL = 0;
}
}

Boris Kazachenko said...

Thanks Dan, with some corrections:

D = 0;
M = 0;
dL = 0;
// compute d, m
for (int i = 0; i < input.size()-1; ++i) { // is that really necessary? all we need is ++pointer_to_i;
d = input[i+1] - input[i];
m = min(input[i+1], input[i]);
d_sign = (d > 0);

if (d_sign == old_d_sign) {
D += d; // not absolute, D is signed. I said the sign is indicated at dL, but I guess that would be slightly more complicated. Doesn't really matter.
M += m; // m is always positive
++dL;
}
else {
D = 0;
M = 0;
dL = 0;
}
old_d_sign = d_sign;
}

That's skipping 2nd derivation, etc.
But do you really think this is a better explanation than the two versions I already posted?