Welcome to the mathematics section
of the Wikipedia reference desk.

Select a section:

WP:RD/MA

Want a faster answer?

Main page: Help searching Wikipedia

How can I get my question answered?

Select the section of the desk that best fits the general topic of your question (see the navigation column to the right).
Post your question to only one section, providing a short header that gives the topic of your question.
Type '~~~~' (that is, four tilde characters) at the end – this signs and dates your contribution so we know who wrote what and when.
Don't post personal contact information – it will be removed. Any answers will be provided here.
Please be as specific as possible, and include all relevant context – the usefulness of answers may depend on the context.
Note:
- We don't answer (and may remove) questions that require medical diagnosis or legal advice.
- We don't answer requests for opinions, predictions or debate.
- We don't do your homework for you, though we'll help you past the stuck point.
- We don't conduct original research or provide a free source of ideas, but we'll help you find information you need.

Ready? Ask a new question!

How do I answer a question?

Main page: Wikipedia:Reference desk/Guidelines

The best answers address the question directly, and back up facts with wikilinks and links to sources. Do not edit others' comments and do not give any medical or legal advice.

April 27

Invariant theory: matrices and invariants under upper triangular matrices

Hi all, I'm learning some invariant theory for rings and I'm getting myself a bit confused with this question - feel like I might have made a mistake, and would appreciate some feedback from someone more experienced than me.

If $G$ acts on $S$ , we write $S^{G}=\{s\in S:gs=s\,\forall \,g\in G\}$ for the invariant ring. In particular, if an element g acts on some finite dimensional vector space W over $\mathbb {C}$ , then for an element $f:W\to \mathbb {C}$ of the coordinate ring $\mathbb {C} [W]$ , we have an action on f by $g:f(\cdot )\mapsto f(g\,\cdot )$ .

We let $M_{2}(\mathbb {C} )$ denote the 2x2 matrices over $\mathbb {C}$ , $U$ denote the upper triangular unipotent matrices (that is, matrices with '1's on the diagonal, a zero below and anything above), and $T$ denote the matrices of the form $\operatorname {Diag} (t,t^{-1})$ . These both act on $M_{2}(\mathbb {C} )$ by (left) matrix multiplication. I wish to find $\mathbb {C} [M_{2}(\mathbb {C} )]^{U}$ and $\mathbb {C} [M_{2}(\mathbb {C} )]^{T}$ , which I think means the polynomials $f:\mathbb {C} [M_{2}(\mathbb {C} )]\to \mathbb {C}$ (i.e. polynomials in the coordinate ring $\mathbb {C} [M_{2}(\mathbb {C} )]$ ) which are fixed under the map $f(A)\mapsto f(MA)$ for any matrix M in $U,\,T$ respectively.

So a matrix in U looks like ${\begin{pmatrix}1&t\\0&1\end{pmatrix}}$ ; this takes ${\begin{pmatrix}a&b\\c&d\end{pmatrix}}\mapsto {\begin{pmatrix}a+tc&b+td\\c&d\end{pmatrix}}$ . An f which is invariant under this transformation can be considered (i think) as $f(a,b,c,d)$ rather than $f({\begin{pmatrix}a&b\\c&d\end{pmatrix}})$ , so that $f(a,b,c,d)=f(a+tc,b+td,c,d)$ : so the question is essentially asking us, if I understand correctly, to find the polynomials f which are invariant under that transformation. What can we say about such polynomials? I tried expanding it as

$f(x_{1},x_{2},x_{3},x_{4})=\sum \limits _{i,j,k,l\geq 0}A_{ijkl}x_{1}^{i}x_{2}^{j}x_{3}^{k}x_{4}^{l}=\sum \limits _{i,j,k,l\geq 0}A_{ijkl}(x_{1}+tx_{3})^{i}(x_{2}+tx_{4})^{j}x_{3}^{k}x_{4}^{l}$

for all i, j, k, l, t.

I then tried looking at this as either a polynomial in $t$ or a polynomial in $t,x_{1},\ldots ,x_{4}$ ; we know that all coefficients of $t^{r}$ are always zero for $r>0$ , and these coefficients are polynomials in the $x_{m}$ and the $A_{ijkl}$ ; what I really wanted to do is show that these coefficients are necessarily nonzero polynomials in the $x_{m}$ , unless we assume all $A_{ijkl}$ involved are zero; i.e. our polynomial can only be a polynomial in the latter two variables, otherwise it is not fixed for every t.

However, when I tried to determine the coefficient of $t^{r}$ as a poly in the $x_{m}$ , I find that multiple terms in $(*)$ can contribute to the same term $x_{1}^{i'}x_{2}^{j'}x_{3}^{k'}x_{4}^{l'}$ in the coefficient of $t^{r}$ , and in fact the function $x_{1}x_{4}-x_{2}x_{3}$ satisfies the requirements but is obviously a function of all 4 variables (note that this is effectively the determinant, though I don't know if that has any significance). Indeed, functions such only in the latter 2 variables are *included* in our class of possible functions, but they don't make up the whole thing. I'm not sure what more we can say about the class; by choice of t I think we can deduce $f(a,b,c,d)=h(ad-bc,c,d)$ for some h, but I'm not sure where to go from here.

What is the invariant ring exactly? Is it just $\mathbb {C} [c,d,e]$ (where e happens to equal ad-bc)? And likewise with the diagonal matrices, I think we get that all terms in the polynomial must be of the form $A_{ijk(i+j-k)}x_{1}^{i}x_{2}^{j}x_{3}^{k}x_{4}^{i+j-k}$ to be invariant, so would we deduce the invariant ring is $\mathbb {C} [x_{1}x_{4},x_{2}x_{4},x_{3}/x_{4}]$ or something like that? Or maybe just any old 3-variable $\mathbb {C} [X,Y,Z]$ since we effectively have 3 variables? Sorry for the long question, I've just started learning invariant theory and I'm still finding it a bit confusing. Thank you for your help :) 86.26.13.2 (talk) 08:14, 27 April 2012 (UTC)[reply]

April 28

Cubic Interpolation Polynomials

Related to the interpolation polynomial I asked above, in one dimension if I have two points and I know both the function values and the derivatives at those two points, I can fit a unique cubic polynomial through them because I will have four unknowns and four constraints. How can I extend this to 2 dimensions? Let's say I have four points (one cell). What would be the form of the polynomial? Would it be a general bivariate cubic
$f(x,y)=ax^{3}+by^{3}+cx^{2}y+dxy^{2}+...$
or a bicubic like this one
$g(x,y)=(ax^{3}+bx^{2}+cx+d)(ey^{3}+fy^{2}+gy+h).$
The first one has fifteen unknowns and the second has eight. So which constraints are natural to use here? I am thinking of four function values, four x-partials, and four y-partial at each of the four points. But this only gives me twelve constraints. Are some constraints redundant or am I missing some? I am just confused about the "natural" extension of the above method to two variables. Thanks!75.171.224.232 (talk) 00:49, 28 April 2012 (UTC)[reply]

Bicubic interpolation may get you started. —Tamfang (talk) 17:40, 28 April 2012 (UTC)[reply]

Right, I have looked at that article (and others on interpolation I could find on wikipedia) but my question is a little different. If I have ten points and I want to use cubic splines then I will need nine cubic polynomials, one for each interval. The way these "standard" cubic splines work is that I derive a large system and then solve for all nine cubic simultaneously. The problem I have is that I have too many points so interpolating this way is infeasible. So I was just asking if anyone knows of a way where I can solve only one cubic (for one interval) at a time as I need them. Linear interpolation would do this but I also need the gradient to be continuous. So then I said to myself, in one dimension, if I have two points x1 and x2 and I know f(x1),f(x2),f'(x1),f'(x2) then I can find the cubic polynomial for that one interval without having to solve for all others. But I am a little confused how to extend this to two dimensions (or even higher). If I can do something like this it will solve a lot of problems I am having. Any ideas? Thanks.75.171.224.232 (talk) 20:50, 28 April 2012 (UTC)[reply]

Have you looked at Numerical Recipies http://www.nr.com/ They have a much better treatment of the subject that wikipedia and will focus more on application issues. Most of the interpolation methods they use are piecewise ones which only use a small number of points at a time.--Salix (talk): 22:23, 28 April 2012 (UTC)[reply]

One 2D technique they mention (NR 3.6.3) is first calculating values for first derivatives

{\frac {\partial f}{\partial x}}

,

{\frac {\partial f}{\partial y}}

and cross derivative

{\frac {\partial ^{2}f}{\partial x\partial y}}

for each point. With this information you can then treat each patch separately using bicubic interpolation.--Salix (talk): 23:36, 28 April 2012 (UTC)[reply]

Yep, I looked at it and looks very promising. Thanks for this suggestion! One question though, did I read it right? They say (if I don't know the values of the partials at the grid points) then basically just assume whatever values you want? This way I will have the smoothness I want. It may not be very accurate but it will be smooth.75.171.224.232 (talk) 03:41, 30 April 2012 (UTC)[reply]

Yes you can just assume the values for the partials. For example in 1D you could assume the derivatives are alway zero giving a smooth but bumpy surface. Better is to estimate them using finite differences.--Salix (talk): 05:16, 30 April 2012 (UTC)[reply]

Trigonometric Interpolation

Second interpolation question, Bo Jacoby, your idea of using trigonometric interpolation is a pretty cool idea. I didn't even think about it. I only have one question about that if someone can clear it up. I remember from my class that a trig polynomial is
$f(x)=a+b\sin(x)+c\cos(x)+d\sin(2x)+e\cos(2x)...$
so in two dimensions (so that I can use the FFT), what is the form of the polynomial? Is it
$g(x,y)=(a+b\sin(x)+c\cos(x)+...)(d+e\sin(y)+f\cos(y)+...)$
analogous to how we generalize polynomial interpolation like bicubics? Thank you.75.171.224.232 (talk) 01:03, 28 April 2012 (UTC)[reply]

Thanks for asking! Here are some tricks.

The expression cos(2πx)+i sin(2πx) was (by Euler) identified as the complex exponential function e^{2π i x}. This identification makes life easier. Forget all about cos and sin and stick to the exponential e^{2π i x}.
The identity e^{2π i}=1 invites to the (nonstandard) notation 1^x = e^{2π i x} . Most people think that 1^x = 1, but that convention is useless - nobody write 1^x meaning 1. When x is rational then e^{2π i x} is algebraic. So forget about the transcendental numbers e and 2π and stick to the root of unity 1^x.
For fixed positive N, there are only N different values of 1^{n / N} . You may think of n as an integer modulo N. The summation sign Σ_n means summation over N consecutive values of n. It doesn't matter if it is from n=1 to N, or from n=0 to N−1 , or from n=−(N−1)/2 to (N−1)/2 for odd N, or from n=−N/2 to N/2−1 for even N.
A trigonometric polynomial is f(x) = Σ_n a_n 1^{n x / N}/√N where the amplitudes are a_n = Σ_x f(x) 1^{n x / N}/√N. Note that complex conjugating the amplitudes is nonstandard. The standard discrete fourier transform has different procedures for transforming forwards and backwards(!).
A two dimensional trigonometric polynomial is f(x,y) = Σ_n Σ_m a_nm (1^{n x / N}/√N) (1^{m y / M}/√M) where the amplitudes are a_nm = Σ_x Σ_y f(x,y) (1^{n x / N}/√N) (1^{m y / M}/√M).
For interpolation, first transform the function values to amplitudes. Then extend the array of amplitudes from (−N/2 ≤ n < N/2) to (−kN/2 ≤ n < kN/2) for some k>1, with zeroes for the high frequencies. (b_−N/2=b_N/2=a_−N/2/2, b_n=a_n for −N/2 < n < N/2, b_n=b_−n=0 for N/2 < n < kN/2, b_−kN/2=0). Multiply the amplitudes by √k. Transform from amplitudes to function values.

Bo Jacoby (talk) 12:01, 28 April 2012 (UTC).[reply]

For example. x=(0,1,2,3,4,5,6,7,8). f(x)=(2,3,4,6,5,2,1,4,2). The amplitudes are a_n=(9.67,-0.953+2i,-0.929-1.76i,-0.333+1.15i,0.382+0.573i,0.382-0.573i,-0.333-1.15i,-0.929+1.76i,-0.953-2i). Insert 9 zeroes and multiply by √2 : b_n=(13.7,-1.35+2.83i,-1.31-2.49i,-0.471+1.63i,0.54+0.81i,0,0,0,0,0,0,0,0,0,0.54-0.81i,-0.471-1.63i,-1.31+2.49i,-1.35-2.83i). Transform again: g(x)=(2, 2.83, 3, 3.12, 4, 5.3, 6, 5.8, 5, 3.72, 2, 0.706, 1, 2.7, 4, 3.5, 2, 1.34). The even items reproduce the original data, g(2x)=f(x), and the odd items interpolate between them. Bo Jacoby (talk) 11:51, 3 May 2012 (UTC).[reply]

More Interpolation

Let's say I want to interpolate using a ridiculously large number of points (like a trillion points). I know their x-values and the corresponding y-values and I want the interpolation function and its first derivative to be continuous. For this, cubic splines are perfect but the problem is that the resulting system is far too large. A trillion points mean 999,999,999,999 cubic polynomials with four unknowns in each. And the system will have something like (4-trillion)^2 entries. Using even single precision (4 bytes per entry), I would need like a lot of hard drive/RAM space. Granted it will be tri-diagonal, I can save memory using sparse structure, and I can use specialized algorithms like Thomas' algorithm to solve it, it is still too large and will take too much time and memory.

Is there any way (any cool tricks anyone knows of) where I can only solve one cubic polynomial in each interval (from n-th to n+1st point) as I need them? I would much rather solve a bunch of small systems as I need them instead of solving them all in preprocessing. This is why I asked about the cubic spline question above. I figured a cubic in any one interval can be solved independently if I give it four constraints. It doesn't have to be cubic. I would love to hear all of your thoughts on any of this or any cool idea anyone knows of. How can I compute a interpolating function on one cell only, such that the function and its derivative/gradient would be continuous if I compute and stitch all of them together? Please remember that eventually I will be doing this in at least three dimensions so the problem will only become worse much much quickly thanks to the curse....of dimensionality.75.171.224.232 (talk) 01:19, 28 April 2012 (UTC)[reply]

I'm not convinced that I understand your question. No matter how many intervals you have, a cubic for each interval depends on the same number of local values. —Tamfang (talk) 17:48, 28 April 2012 (UTC)[reply]

I think the problem is that while the datapoints are know, the gradients are not. In fitting the splines you need to adjust adjacent patches to match gradients. First thoughts are you could try to drastically reduce the size of you dataset before fitting splines. There are lots of techniques for this start with Dimension reduction. How accurately are the data points obtained will there be some error in their measurement? If so beware of overfitting. There quite a few google hits for "interpolation large datasets" with some interesting techniques, Multi Level B-Spline might be promising.--Salix (talk): 19:01, 28 April 2012 (UTC)[reply]

To assign a first derivative to each sample point, in a one-dimensional sequence, you fit a quadratic to it and its two neighbors; if the samples are equally spaced, this will give a tangent slope equal to the slope of a straight line between the two adjacent samples. In multiple dimensions I imagine it's similar. If the sample points are not a grid, you'd probably have to start with Delaunay triangulation. —Tamfang (talk) 18:48, 1 May 2012 (UTC)[reply]

You can always compute a patch-wise spline and then blend the patches together. For example, if you have 200 points, you might do the following:

Compute 3 splines:
- One on the first 100 points, S1(x)
- One on points 51-150, S2(x),
- One on points 101-200, S3(x)
Define weight functions:
- w1(x) = 1 on points 0-50, declines linearly to zero over points 51-100, and zero otherwise
- w2(x) = grows linearly from zero on points 51-100, declines linearly to zero over points 101-150, and zero otherwise
- w3(x) = grows linearly from zero on points 101-150, 1 on points 151-200, and zero otherwise
Define a composite curve: S(x) = (w1(x)*S1(x) + w2(x)*S2(x) + w3(x)*S3(x)) / (w1(x) + w2(x) + w3(x));

It should be obvious how to extend this to arbitrarily many patches. This will be continuous. With just a little more effort, you can make the blending function smooth and ensure the composite curve is also completely smooth. You aren't guaranteed to have the same result as the full spline, but as long as the patches are large, the edge effects will be small and you get a good approximation of the full spline (assuming you aren't overfitting in the first place). Dragons flight (talk) 19:23, 1 May 2012 (UTC)[reply]

Request for applications

Could anyone supply me with explicit information about the application of second order ODEs? Firstly, I am interested in the case of linear homogeneous ODEs, i.e.

ay''+by'+cy=0\,,

where a, b and c are constants and a is non-zero. Secondly, I am interested in the non-homogeneous case:

ay''+by'+cy=f\,,

where ƒ is a sum, or difference of, exponential and trigonometric functions. — Fly by Night (talk) 21:38, 28 April 2012 (UTC)[reply]

Hooke's law gives an application of the homogeneous case (with a and c terms in your equation). The b term is a damping term (see damping). The f represents the application of a force external to the system. Many other mechanical systems have similar interpretations for their respective differential equations (e.g., RLC circuits). Sławomir Biały (talk) 01:15, 29 April 2012 (UTC)[reply]

The special case cy=f is Hooke's law, that force is proportional to displacement: more load makes a spring longer. The case by'=f is Aristotle saying that force is proportional to velocity: more horses draw a wagon faster. The case ay"=f is Newton's law that force is proportional to acceleration: the falling apple. Bo Jacoby (talk) 04:10, 29 April 2012 (UTC).[reply]

Second-order fixed-coefficients differential equations sometimes come up in economics, with f being either zero or a non-zero constant. Two examples are given in Chiang, Alpha C., Fundamental Methods of Mathematical Economics, McGraw-Hill, 3rd edition, 1984, pp.529-540. One of them equates supply and demand for a good, with demand being affected not only by price P but also by the trend in price given by dP/dt and d²P/dt². The other one involves the economy-wide inflation rate and its first and second derivatives, as it relates to the unemployment rate. In both examples the first and second derivatives appear because of the assumptions about how price or inflation expectations are formed. Duoduoduo (talk) 17:34, 29 April 2012 (UTC)[reply]

An RLC circuit current and voltage response are described with second order ODEs, with f(t) being an external signal (current or voltage) applied to the circuit. --CiaPan (talk) 05:37, 30 April 2012 (UTC)[reply]

April 29

Music computation

Given the beginning of a musical composition, how to mathematically compute the rest? 118.70.177.182 (talk) 04:37, 29 April 2012 (UTC)[reply]

Of one that already exists, or for the generation of new music?--Gilderien Chat|List of good deeds 11:13, 29 April 2012 (UTC)[reply]

Of course, there are multiple ways to end a piece from a given start. However, you could make a program that would apply a standard series of variations (changes in instrumentation, substituting chords for single notes, counterpoint, changes in octave, etc.) to a given musical phrase. StuRat (talk) 11:49, 29 April 2012 (UTC)[reply]

Maybe Analytic continuation?-77.127.57.229 (talk) 12:43, 29 April 2012 (UTC)[reply]

Latex package for Dynkin diagrams

Does anyone know of a decent LaTeX package for drawing Dynkin diagrams? I can only make them look nice if I draw them "manually" (basically with bullets and rules, manually adjusting every space so that it comes out right). I have tried (perhaps inexpertly) to use xypic as well, but the diagrams come out too big (and just not looking quite right either). Does anyone know of a more targeted solution? Sławomir Biały (talk) 12:07, 29 April 2012 (UTC)[reply]

Are you aware of http://lesha.goder.com/dynkin-diagrams.html ? Looie496 (talk) 17:54, 29 April 2012 (UTC)[reply]

Yes, I had seen that a long time back. They didn't really look good enough to me to warrant the pain of and lack of flexibility of importing eps files. I'm looking for a flexible (or at least hackable) LaTeX solution. Given that things like xypic and amscd are possible, surely there must be a native LaTeX solution out there? Sławomir Biały (talk) 21:10, 29 April 2012 (UTC)[reply]

I've been using tikz to draw finite automata recently. It's a graph drawing package, and it should be able to do Dynkin diagrams too; I didn't read any more of the documentation than what I needed for my own stuff, but I noticed it has support for multiple edge types, so it probably has what you need.--130.195.2.100 (talk) 23:26, 29 April 2012 (UTC)[reply]

Thanks, that looks very promising. I'll give it a try. Sławomir Biały (talk) 12:30, 30 April 2012 (UTC)[reply]

PGF/TikZ :-) 86.104.57.242 (talk) 10:23, 6 May 2012 (UTC)[reply]

Have you tried dynkin-diagrams (https://ctan.org/pkg/dynkin-diagrams)?

April 30

question about one sided limits

f is differentiable at all points except possibly at 0, $\lim _{x\rightarrow 0+}f'(x)=\lim _{x\rightarrow 0-}f'(x)$ and f is a continuous at 0. Probe that f is differentiable at 0.--49.178.5.29 (talk) 00:05, 30 April 2012 (UTC)[reply]

And if you're not an alien with your probe handy, you can prove it instead. :-) StuRat (talk) 00:12, 30 April 2012 (UTC) [reply]

Hint: Use the definition of derivative. To bound

f(\Delta x)

, use an intermediate point

x_{0}

, bound

f(x_{0})

and

f'(x)

, and use the Mean value theorem. -- Meni Rosenfeld (talk) 08:48, 30 April 2012 (UTC)[reply]

Calculating the 3 marks that drag a weighted average down the most

Hello,

Apologies if this is a very easy problem, I’m certainly no mathematician.

My university calculates an ‘honours mark’ for each student. The honours mark is basically an average of each student’s marks, weighted according to the number of credit points attributed to that subject. However, Students may discount their 3 ‘worst’ subjects.

If a student’s 3 worst subjects are those that drag their weighted average down the most, how does one go about calculating what those 3 marks are? (Apart from trial and error).

Thanks, Joaq99 (talk) 04:16, 30 April 2012 (UTC)[reply]

Perhaps you could find the weighted average of all the classes, then find the 3 whose deviations from that average, when multiplied by the number of credits, is the most negative number ? StuRat (talk) 04:28, 30 April 2012 (UTC)[reply]

StuRat: can you prove that rule correct? Because I think it might not be. – b_jonas 11:20, 30 April 2012 (UTC)[reply]

StuRat: for example, suppose you wanted to throw away just two marks from the following four:

name	grade	weight
P	1	9
Q	1	1
R	3	10
S	5	10

Now the weighted average is 3, so the deviations from average multiplied by the weight are, respectively, -18, -2, 0, 20, so if I understand your rule correctly, you'd throw away the marks for subjects P and Q, which would give a weighted average of 4. However, it's better to throw away subjects P and R, as that would lead to a grade average near 4.64. – b_jonas 11:30, 30 April 2012 (UTC)[reply]

In fact it's not even true when you need to remove one mark. If A, B and C have weights 1, 2, 1 and grades 1, 0, -0.3, then the StuRat power of B is -0.35 and for C is -0.475, but the average after removing B is 0.35 and after removing C is 1/3. -- Meni Rosenfeld (talk) 11:45, 30 April 2012 (UTC)[reply]

Yes, my method is an approximation, which works in the real world, where you don't get a 10 to 1 ratio in the number of credit per class. To add a bit of a safety factor to it, you could try removing, say, each 3 of your 4 "most negative" classes, calculated by the method I specified. StuRat (talk) 17:21, 30 April 2012 (UTC)[reply]

What is this "real world" of which you speak? :)

Anyway, you'll find that my counterexample has 1:2 weight ratio. Your method is most accurate when there are many classes, in which case removing items doesn't have a great effect on the denominator. -- Meni Rosenfeld (talk) 18:32, 30 April 2012 (UTC)[reply]

Yes, and also where class grades tend to vary more than class credits. StuRat (talk) 19:07, 30 April 2012 (UTC)[reply]

[ec] I don't think this is a trivial problem. StuRat's suggestions is an approximation but the exact result is different, and the marks that are each optimal individually needn't be optimal together.

A mark that is dominated by at least

k=3

other marks (they each have both lower grade and higher weight) cannot be in the optimal set, so these can be discarded (which is

O(n^{2})

and can reduce the effective value of n). But other than that I don't know of a better way than scanning all possibilities, which is

O(n^{k})

in the general case.

Some optimization is possible by calculating each average in

O(k)

rather than

O(n)

. -- Meni Rosenfeld (talk) 11:36, 30 April 2012 (UTC)[reply]

Meni: nice example for removing a single grade. I agree that this is an interesting mathematical problem, even though with real life students and grade averages it's feasable to do a brute force computer solution, or a hand computation with fast runtime for typical input.

Now as for the actual question. If you wanted to remove just one grade, then you could compute the average without each grade all in linear time. Would it give the correct result to just iterate this, repeatedly throwing away a grade in a greedy way?

Also, as a clarification, can we assume that the grades are limited to just a few values (say integers from 1 to 5)? If so, that would make this simpler. – b_jonas 12:41, 30 April 2012 (UTC)[reply]

Greedy doesn't work. Let A, B, C, D have grades 7, 2, 2, 0 and weights 2, 2, 2, 1. If you can remove one mark it should be D, but if you can remove two it's B and C. -- Meni Rosenfeld (talk) 12:59, 30 April 2012 (UTC)[reply]

Huh? I can't reproduce that one. If you can remove two, it should be C and D to get an average of -1, because if you removed B and C the weighted average would be -1.4. – b_jonas 14:59, 30 April 2012 (UTC)[reply]

Sorry, had a typo, weight of A should be 1 (2 now that I've rescaled). -- Meni Rosenfeld (talk) 15:45, 30 April 2012 (UTC)[reply]

Ah, indeed, it does work that way. Now I should try to understand why it works. – b_jonas 16:15, 30 April 2012 (UTC)[reply]

I can't say I truly get it myself. But here are two ways to think about it (which may be easier now that I've rescaled the example to use only nonnegative integers):

Use the StuRat approximation. D has slightly more StuRat power so if only one item is removed, it should be it. If two need to be removed, then clearly B needs to be one of them. Once that's done, the average is higher; since C has greater weight than D, this has a greater effect on its StuRat power, which now exceeds D's. (This may or may not be literally true for this example, didn't check).
Consider the numerator and denominator of the weighted average, and how removing an items affects them both. With one item removed, it should be D because of its effect on the numerator. The less the total weight, the more significant is the effect on the denominator; so after B is removed, C is next because removing it greatly reduces the denominator.

-- Meni Rosenfeld (talk) 18:32, 30 April 2012 (UTC)[reply]

Another way to look at it: I have a backpack. I want to cram it full of grades. I can only hold so much weight and the value per weight of each grade is not the same. How can I maximize the value? Remind anyone of an old, well-documented problem? — Preceding unsigned comment added by 128.23.112.209 (talk) 18:38, 30 April 2012 (UTC)[reply]

The invalidity of the one-at-a-time approach strikes me as somehow related to Simpson's paradox -- both involve comparing things with different but overlapping denominators.

Also, this problem of efficiently finding the ones to remove seems very similar to this problem: given a regression equation in which it is postulated that k of the n data points are outliers that should not have been included in the regression (because they may be from a different causative structure), how do you efficiently find the set of k data points whose deletion will most improve the fit of the regression? See for example Cook's distance and Outlier#Identifying outliers. Duoduoduo (talk) 19:31, 30 April 2012 (UTC)[reply]

Just an aside, but the harder the true answer is to find, the less likely that the university is actually applying the "correct" solution. Joaq99, I would surprised if the university has really thought this out. More likely someone is doing something simple, like canceling the three lowest grades regardless of weight. While the mathematical puzzle is undoubtedly interesting, you might be more likely to get at the truth by inquiring with your university about what procedure they are actually using to remove the "worst" grades. Dragons flight (talk) 19:45, 30 April 2012 (UTC)[reply]

This http://www.ics.uci.edu/~eppstein/pubs/EppHir-TR-95-12.pdf paper addresses this particular problem. --Modocc (talk) 20:55, 30 April 2012 (UTC)[reply]

Dragons flight: in a realistic case, it's completely feasable for the university to even use a brute force computation with a computer. We have implicitly started discussing the abstract problem where the number of grades can be large, and the allowed grade values and the allowed grade weights needn't be integers taken from a very small predetermined set. – b_jonas 21:26, 30 April 2012 (UTC)[reply]

Modocc: nice find! That's indeed exactly the same problem. – b_jonas 21:29, 30 April 2012 (UTC)[reply]

Notably, they give an

O(n)

algorithm. -- Meni Rosenfeld (talk) 05:21, 1 May 2012 (UTC)[reply]

From the above conversation it seems that a 'brute force' calculation is the way to go. I'm assuming this refers to a computer program that calculates the weighted average eliminating every possible combination of 3 and simply takes the maximum of all the weighted averages.

I would like to thank everyone for their efforts (even if you enjoy it, it's much appreciated). I'm interested in this discussion even beyond its practical implications for me.

Dragons flight -- The university was initially removing the 3 subjects with the lowest (mark * weight). When I complained that this was clearly wrong, the university said they would change their approach to simply removing the 3 subjects with the lowest marks. I'm unhappy with this approach too but I thought that before I proceeded with the complaint I should try and figure out the correct way to do so. I soon figured out that finding such a method was beyond my high school mathematics skills. Joaq99 (talk) 01:30, 1 May 2012 (UTC)[reply]

Lowest mark*weight, or lowest (mark-average)*weight (StuRat power)? So a grade of 1 with weight 1 will be removed rather than a grade of 1 with weight 100? Most negative StuRat power is a handy approximation; lowest grade is a handy approximation; lowest mark*weight is just idiotic and whoever is responsible for it should be fired.

They should just use brute force. It takes exactly 5 minutes to write a program to do that. Is the final grade of all university students really that unimportant? -- Meni Rosenfeld (talk) 04:30, 1 May 2012 (UTC)[reply]

Good luck with that Joaq. I've been part of several universities, and though they all had many smart faculty, I've pretty much invariably found that the staff responsible for processing grades, transcripts, and the like had trouble understanding all but the most trivial of mathematics. I'm not sure why that should be, but it certainly seems like whatever the actual requirements for that job (mostly clerical skills I assume), that numerical literacy was not among them. The only calculations I really trust them to do and describe accurately are the very simplest imaginable. Of course that's just my experience, your experience may be better. Dragons flight (talk) 05:29, 1 May 2012 (UTC)[reply]

A closer look: If the sum of all weights is W and the sum of all weight*grade is T, then the average starts at

A={\frac {T}{W}}

. If a class with weight

w_{1}

and grade

g_{1}

is dropped, the new average is

{\frac {WA-w_{1}g_{1}}{W-w_{1}}}=A+{\frac {(A-g_{1})w_{1}}{W}}+(A-g_{1})\left({\frac {w_{1}}{W}}\right)^{2}+O\left(\left({\frac {w_{1}}{W}}\right)^{3}\right)

. The second term has the StuRat power in it which is why the approximation works when W is large enough.

If you drop two classes, the average is

{\frac {WA-w_{1}g_{1}-w_{2}g_{2}}{W-w_{1}-w_{2}}}=A+(A-g_{1})\left({\frac {w_{1}}{W}}+\left({\frac {w_{1}}{W}}\right)^{2}\right)+(A-g_{2})\left({\frac {w_{2}}{W}}+\left({\frac {w_{2}}{W}}\right)^{2}\right)+(2A-g_{1}-g_{2}){\frac {w_{1}w_{2}}{W^{2}}}+O\left(\left({\frac {w_{1}+w_{2}}{W}}\right)^{3}\right)

That is, to second order you get the same terms as with dropping each individually, but with the additional term

(2A-g_{1}-g_{2}){\frac {w_{1}w_{2}}{W^{2}}}

which can give an advantage over the best individual class to drop. In particular, a class which has a low weight and low grade will greatly reduce the

w_{1}w_{2}

part (because the weights are multiplied) but will only somewhat increase

(2A-g_{1}-g_{2})

(since the grades are added, and a doubling of the distance from the average is diluted by the other grade), thus it may be better to drop two classes of higher grade but also higher weight. -- Meni Rosenfeld (talk) 10:58, 1 May 2012 (UTC)[reply]

The Eppstein-Hirschberg paper cited above by Modocc, despite a promising opening paragraph, is not about the same problem as we are talking about here. That paper maximized

{\frac {\sum _{i\in T}v_{i}}{\sum _{i\in T}w_{i}}}

where

v_{i}

is a grade and

w_{i}

is a weight (see their eqs. (1) and (2)). But the grade-point averages that we are trying to maximize are

{\frac {\sum _{j\in T}v_{j}*w_{j}}{\sum _{i\in T}w_{i}}}

. Duoduoduo (talk) 17:34, 1 May 2012 (UTC)[reply]

Our problem is trivially reduced to the problem in the paper by setting

v_{i}=g_{i}\cdot w_{i}

. -- Meni Rosenfeld (talk) 18:12, 1 May 2012 (UTC)[reply]

May 1

May 2

May 3

Octahemioctahedron

Why is the octahemioctahedron the only orientable hemipolyhedron? Double sharp (talk) 12:23, 3 May 2012 (UTC)[reply]

It seems from the article that there are only 9 hemipolyhedrons (I don't know why these are the only ones). It's not hard to check that the octahemioctahedron is the only orientable one. Look at the faces that don't pass through the center. Ones that meet at a vertex have the opposite orientation. In order for the polyhedron to be orientable, there can't be any odd walk through the outside faces sharing vertices. This only happens on the octahemioctahedron (which is due to the fact that the outside faces are arranged like the vertices of a cube, which is bipartite). Rckrone (talk) 05:37, 5 May 2012 (UTC)[reply]

4-fold and 5-fold rotational symmetry

Why is it that Euclidean polyhedra cannot have both 4-fold and 5-fold rotational symmetry? What about polychora (4-polytopes) and higher? Double sharp (talk) 12:31, 3 May 2012 (UTC)[reply]

I think is due to the fact that if 4 pentagons met at a vertex, the total angle here would be 4 × 108° = 432. Slightly more rigorously would be to consider Möbius triangles. As 1/2 + 1/4 + 1/5 < 1 the corresponding Möbius triangle with angles π/2, π/4, π/5 will lie in the hyperbolic plane giving the Order-5 square tiling or Order-4 pentagonal tiling. Whats confusing me at the moment is Schwarz triangle with angles π/2, π/4, 2π/5, here 1/2 + 1/4 + 2/5 > 1 so it should be spherical, but it does not seem to exist.--Salix (talk): 07:37, 5 May 2012 (UTC)[reply]

How widespread is the notation

$b){\overline {~a~}}$ in elementary schools around the world? (meaning a divided by b).— Preceding unsigned comment added by ‎ OsmanRF34 (talk • contribs) 21:51, 3 May 2012

According to the Long division article, its the notation used in English speaking countries, China, Japan and India. There are several different notations used one in Brazil, Venezuela and Colombia; one in the rest of Latin America; one in Spain, Italy, France, Portugal, Romania, Turkey, Greece and Russia; one in Germany, Norway, Macedonia, Poland, Croatia, Slovenia, Hungary, Czech Republic, Slovakia and in Bulgaria. --Salix (talk): 22:49, 3 May 2012 (UTC)[reply]

May 4

Uniform dual polyhedra

If all the Catalan solids have constant dihedral angles, do all the duals of the uniform polyhedra (the nonconvex ones) also have constant dihedral angles? (I suspect the answer is yes.) Double sharp (talk) 03:12, 4 May 2012 (UTC)[reply]

Yeah, the canonical dual having constant dihedral angles follows from the original polyhedron having regular polygons as faces. It shouldn't depend on convexity of the polyhedron, or even convexity of the faces. Rckrone (talk) 04:58, 5 May 2012 (UTC)[reply]

May 5

Summation

Could someone please explain to me why $\sum _{i=0}^{m-r-1}{m \choose m-i}=\sum _{j=r+1}^{m}{m \choose j}$ . I can see we're using the substitution j=i+r+1 but I don't understand why the binomial coefficient ends up the way it does. Thanks. 131.111.184.11 (talk) 12:51, 5 May 2012 (UTC)[reply]

The binomial coefficients are reversed in the order of summation with the substitution you have used. Rather consider the substitution

j = m - i

, and you will have to swap the limits because generally

m \geq r + 1

. — Quondum ^☏ 13:23, 5 May 2012 (UTC)[reply]

Normalise a DE

What does it mean to "normalise" a DE? I am told to normalise $y''-xy=0$ by the substitution $t={\frac {2}{3}}x^{3/2}$ . My working out is ${\frac {dy}{dx}}={\frac {dy}{dt}}{\frac {dt}{dx}}={\frac {dy}{dt}}x^{\frac {1}{2}}$ . ${\frac {d^{2}y}{dx^{2}}}={\frac {dt}{dx}}{\frac {d}{dt}}{\frac {dy}{dx}}={\frac {dt}{dx}}{\frac {d}{dt}}({\frac {dy}{dt}}{\frac {dt}{dx}})=x^{\frac {1}{2}}{\frac {d}{dt}}({\frac {dy}{dt}}x^{\frac {1}{2}})=x^{\frac {1}{2}}({\frac {d^{2}y}{dt^{2}}}x^{\frac {1}{2}}+{\frac {dy}{dt}}{\frac {1}{2}}x^{-{\frac {1}{2}}}{\frac {dx}{dt}})=x^{\frac {1}{2}}({\frac {d^{2}y}{dt^{2}}}x^{\frac {1}{2}}+{\frac {1}{2}}x^{-1}{\frac {dy}{dt}})=x{\frac {d^{2}y}{dt^{2}}}+{\frac {1}{2}}x^{-{\frac {1}{2}}}{\frac {dy}{dt}}$ . So the DE is $x{\frac {d^{2}y}{dt^{2}}}+(-x^{\frac {3}{2}}+{\frac {1}{2}}x^{-{\frac {1}{2}}}){\frac {dy}{dt}}=0$ . $t={\frac {2}{3}}x^{\frac {3}{2}}\Rightarrow x=\left({\frac {3}{2}}t\right)^{\frac {2}{3}}$ .

So the DE is $\left({\frac {3}{2}}t\right)^{\frac {2}{3}}{\frac {d^{2}y}{dt^{2}}}+(-{\frac {3}{2}}t+{\frac {1}{2}}\left({\frac {3}{2}}t\right)^{\frac {2}{3}}){\frac {dy}{dt}}$ . Doesn't look very normal to me. 150.203.114.37 (talk) 20:11, 5 May 2012 (UTC)[reply]

Did you mean

y''-xy'=0

? Otherwise your substitution looks off. 129.234.53.19 (talk) 21:41, 5 May 2012 (UTC)[reply]

Yes, that is what I meant. 150.203.114.37 (talk) 22:12, 5 May 2012 (UTC)[reply]

Except it actually is

y''-xy=0

. WHOOPS. I'll correct it and come back if I'm still stuck. 150.203.114.37 (talk) 22:14, 5 May 2012 (UTC)[reply]